Research IT

Making Large Databases Easily Available

Research IT have recently collaborated with Dr. Mabel Sánchez Barrioluengo in the Alliance Manchester Business School (AMBS) to make a large patent information database easily available to researchers in AMBS and across the University.

Dr. Sánchez Barrioluengo is an AMBS Presidential Fellow with an interest in understanding how changes in the nature of work are affecting "Industry 4.0", and in the relationship between those changes and innovation. Patents provide a great deal of information about such changes, and the PATSTAT database is regarded as a standard in the field of patent intelligence and statistics.

Colleagues in AMBS had purchased the PATSTAT database from the European Patent Office (EPO) so that it could be used across the School. However, PATSTAT is distributed as a large set of CSV files (over 200 GB), and the data is difficult for researchers to use in this format. Research IT were approached to see if they could help to improve the usability of the database.

Theresa Teng and Richard Hoskins from Research IT liaised with the Windows and Virtualization team in ITS, and with Dr. Sánchez Barrioluengo, to configure a virtual machine to host a Microsoft SQL Server instance containing the data from PATSTAT, build the database, import the data and ensure that the data was regularly backed up, secure and accessible.
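The article does not describe how the import itself was carried out. As a rough illustration of the general idea only, the sketch below loads a CSV file into a database table in small batches, so that a very large file never has to be held in memory at once. It uses Python's built-in SQLite module as a stand-in for Microsoft SQL Server, and the table name, column names and sample rows are all hypothetical rather than taken from PATSTAT.

```python
import csv
import io
import itertools
import sqlite3


def import_csv(conn, csv_file, batch_size=1000):
    """Insert rows from csv_file into patent_title in batches; return the row count."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    total = 0
    while True:
        # Read at most batch_size rows so memory use stays bounded.
        batch = list(itertools.islice(reader, batch_size))
        if not batch:
            break
        conn.executemany(
            "INSERT INTO patent_title (appln_id, title) VALUES (?, ?)", batch
        )
        conn.commit()  # commit each batch separately
        total += len(batch)
    return total


# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patent_title (appln_id INTEGER, title TEXT)")
sample = io.StringIO("appln_id,title\n1,Widget\n2,Gadget\n3,Sprocket\n")
loaded = import_csv(conn, sample, batch_size=2)
```

With files on the scale of PATSTAT, a database server's own bulk-loading tools would normally be a better fit than row-by-row inserts; the batching shown here is only meant to convey the principle.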

A networked database has some significant benefits: the information can be shared with any other interested researchers across the campus, and the database is easy to tune and refine. For instance, Research Software Engineers (RSEs) can build views and indexes to improve the ease of data access and the speed of queries. Because the database is hosted centrally, backups, restores and maintenance are handled using established processes.
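To give a flavour of what building views and indexes involves, here is a minimal sketch. It again uses Python's built-in SQLite module as a stand-in for Microsoft SQL Server, and the tables, columns, view and index are hypothetical simplifications rather than the real PATSTAT schema. The index is intended to speed up filtering by filing year, and the view hides a join so that researchers can query one object instead of two tables.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database with one index and one view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical, simplified tables (not the real PATSTAT schema).
        CREATE TABLE application (
            appln_id    INTEGER PRIMARY KEY,
            authority   TEXT,
            filing_year INTEGER
        );
        CREATE TABLE title (
            appln_id INTEGER REFERENCES application (appln_id),
            title    TEXT
        );

        -- Index to support lookups by filing year.
        CREATE INDEX idx_application_filing_year
            ON application (filing_year);

        -- View that pre-joins applications with their titles.
        CREATE VIEW application_with_title AS
            SELECT a.appln_id, a.authority, a.filing_year, t.title
            FROM application AS a
            JOIN title AS t ON t.appln_id = a.appln_id;
        """
    )
    conn.executemany(
        "INSERT INTO application VALUES (?, ?, ?)",
        [(1, "EP", 2014), (2, "US", 2015), (3, "EP", 2015)],
    )
    conn.executemany(
        "INSERT INTO title VALUES (?, ?)",
        [(1, "Widget"), (2, "Gadget"), (3, "Sprocket")],
    )
    conn.commit()
    return conn


def titles_for_year(conn, year):
    """Return titles of applications filed in the given year, via the view."""
    rows = conn.execute(
        "SELECT title FROM application_with_title "
        "WHERE filing_year = ? ORDER BY appln_id",
        (year,),
    )
    return [row[0] for row in rows]
```

A researcher could then call `titles_for_year(build_demo_db(), 2015)` without needing to know how the underlying tables are joined.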

Having PATSTAT installed on UoM servers will allow Dr. Sánchez Barrioluengo to easily broaden her portfolio of research activities, such as linking skills and employment with the innovation capacity of firms under the Digital Economy. It will also allow researchers to carry out sophisticated statistical analyses of bibliographical and legal status patent data, and to merge other business data sources (WERS, ABS/ABI, BSD or FAME) with PATSTAT.

If you are interested in accessing the PATSTAT database, then please contact Theresa Teng or Richard Hoskins in Research IT. Currently the database service is in a test state, containing only patent titles and metadata up to 2016. However, there may be scope in the future to load full patent abstracts, should sufficient demand exist.