About the Chemical Database Service

Historical Introduction

The quantity and diversity of chemical information have increased rapidly over the years.

A committee was set up in 1992 by the Engineering and Physical Sciences Research Council (EPSRC) to review the whole issue of chemical database provision for the UK academic community. It concluded that there were a number of distinct advantages in providing a dedicated central facility. The advantages included cost sharing, ready availability of specialist information, access to large systems which would be beyond the means of individual users, support and user training.

The Chemical Database Service at Daresbury Laboratory, which was already providing access to a range of mainly crystallographic data, won the tender to provide this service. It was then able to extend its portfolio to include spectroscopy, available chemicals and synthetic organic chemistry databases.

The Current Service

The CDS is a national facility dedicated to the provision of centralised chemical information for the UK academic community. Daresbury Laboratory has excellent computer network access, and the CDS group is now part of the Computational Science and Engineering Department, which means experienced support staff (system programmers, network support etc.) are readily available to the Service.

The specific aim of the Service is to ensure that the growing body of information from chemical research is conveniently accessible to its target community. This is done by providing on-line access to up-to-date, comprehensive, high-quality chemical databases and ancillary facilities. The Service also provides help, advice and training.

Users are supported by post-doctoral chemists. They work as a team and have responsibilities for and expertise in the various database areas.

The CDS has a Management Advisory Panel (CDS MAP). The roles of the CDS MAP include service policy formulation and providing an interface to the academic community.

Eligibility

The Service is available to all UK "academics". A UK academic is defined as someone employed by, or working at, a UK university. This includes lecturers, students, post-doctoral workers, technicians, librarians, information officers, visiting academics (for the duration of their visit) etc. A key condition is that no commercial use is made of the Service.

Registration for the Service is straightforward and there is now a new Express Registration system. Use of the Shibboleth mechanism (in line with access to Web of Knowledge, MIMAS, JISCmail etc.) has recently been implemented.

There are currently almost 4,500 individually registered users. In addition, special short-term access is regularly made available for specific online sessions organised at university sites as part of their local training schedules.

The Databases

The databases are available in the following areas:

Structures

The CDS offers a comprehensive collection of crystal structure data covering organic, organometallic and inorganic compounds, metals, alloys and intermetallics. The databases comprise the Cambridge Structural Database (including ConQuest, VISTA, Mercury, Mogul & IsoStar), ICSD, CrystMet and the NIST Crystal Data File.

They are regularly updated. ICSD is now available via its own Web interface. CrystalWeb is a global web interface, which allows search, display and coordinate output for all the crystallographic datasets.

Spectroscopy

Spectroscopic facilities on the CDS are provided by the SpecInfo and I-Lab systems.

The SpecInfo system is produced by Chemical Concepts and allows both spectrum prediction and searching. It uses the SpecSurf web browser-based interface, which makes drawing structures and viewing hit lists straightforward for the user. Its knowledge base contains 1H, 13C, 15N, 17O, 19F and 31P NMR spectra, mass spectra and IR spectra.

I-Lab is produced by Advanced Chemistry Development, Inc. It is an online system which provides instant user access to spectroscopy information, compound name generation and physical property prediction facilities. It is available via both client/server and web browser interfaces. Its spectroscopic coverage in many ways mirrors the NMR data available via SpecInfo, but its proton NMR prediction capabilities are particularly impressive.

Thermophysical Data

Thermophysical data facilities on the CDS are provided via the Detherm database system and the I-Lab package. Both systems are available via both client/server and web browser interfaces.

Detherm is produced by DECHEMA eV (the German Society of Chemical Engineering and Biotechnology), and is of particular value to the Chemical Engineering community.

Detherm is one of the world's largest thermophysical property databases of pure compounds and compound mixtures. It contains approximately 5.88 million data sets for around 127,000 systems (over 26,500 pure substances and 101,300 mixtures) covering more than 500 property fields. For instance, in the field of vapour-liquid equilibrium data, it contains more than 95% of the experimental measurements published worldwide.

The I-Lab package has been mentioned in the Spectroscopy section above, but it also includes a physical properties component. It allows you to search its database for pKa (about 16,000 structures), LogP (over 18,400 structures) and solubility (over 5,000 compounds). Physical properties can also be predicted for input molecules, including pKa, LogP, LogD, aqueous solubility, boiling point, vapour pressure, enthalpy of vaporization, adsorption coefficient and bioconcentration factor.

Organic Chemistry

Until May 2007 the Service was able to provide a wide range of databases covering the areas of synthetic organic chemistry, chiral separation, chemical procurement and screening compounds. These components were withdrawn following a decision by an EPSRC Review Panel in 2006. This decision was considered bizarre by many in the community.

Efforts are being made to restore some of the key lost facilities. We currently provide access to demo systems covering a limited range of available chemical and screening compound information. We will also provide access to SPRESIweb, a comprehensive database of chemical structures and reaction information produced by ChemInform, until at least April 2010.

Links to Further Details

Structures, Spectroscopy, Physical Chemistry, SPRESIweb, Presentation Material & the EPSRC Decision

Further Information

A book has been produced covering developments in computerised chemical information around the end of the twentieth century. This includes a chapter written about the CDS.

"The United Kingdom Chemical Database Service: CDS", Bob McMeeking & Dave Fletcher, in Cheminformatics Developments: History, Reviews and Current Research (Ed. J. H. Noordik), IOS Press, Amsterdam, Chapter 2, pp 37-67, 2004.


The CDS chapter covers the history of the Service from its early phase to its present, more extensive data portfolio. It also gives a good snapshot of the Service as it stood until recently.

The publishers have released the book "Cheminformatics Developments" as the first volume of a new journal entitled "Cheminformatics". The first volume is available online, and can now be accessed via the link below. A number of the articles (including the one about the CDS) are available free of charge without subscription.

Link to CDS review article in "Cheminformatics"

There is also further material about the current state of the Service. This includes a comprehensive PowerPoint demonstration.

Link to Presentation Material

Conclusion

The UK academic user community continues to derive a great deal of value from easy access to high-quality, comprehensive, centralised chemistry databases. The benefits include considerable savings in time, effort and cost, which have promoted higher productivity and efficiency. New modes of searching have also become possible, which would have been time-consuming or impossible using printed literature. These result in novel leads, which typically indicate new directions for research projects.