Tuesday, March 31, 2015

What is polymer informatics?

Polymer informatics combines polymer chemistry, computer science and information science. Its aim is to advance the design, analysis and understanding of polymer systems. A polymer informatician draws on the systematic study of computational methods, knowledge acquisition strategies and pattern recognition algorithms to develop digital solutions for polymer research and engineering.

Like the related disciplines cheminformatics and bioinformatics, polymer informatics is an interdisciplinary field. It is an emerging discipline that should not be considered a subdiscipline of cheminformatics. Cheminformatics deals with small molecules, i.e. molecules with a confined structure whose composition and atom connectivity can be represented precisely by a molecular graph and an associated connection table. The subject of polymer informatics is the rational management of macromolecules: chain-like molecules consisting of one or more structural repeat units (SRUs). Regular single- and multi-strand polymers and copolymers are the key ingredients of polymer systems such as blends and composites. Cheminformatics and polymer informatics are mostly design-oriented. In contrast, bioinformatics pays particular attention to the sequence patterns of biomacromolecules (typically nucleic acid and protein sequences) within the context of biological processes and gene-based drug discovery.
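To make the distinction concrete, here is a minimal Python sketch of the two kinds of representation. It is an illustration only: the class names `ConnectionTable` and `RepeatUnit`, the `head`/`tail` attachment indices and the `oligomer` method are hypothetical and do not correspond to any particular file format or toolkit. Hydrogen atoms are left implicit, and the link between repeat units is assumed to be a single bond.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ConnectionTable:
    """A small molecule: atom symbols plus (atom_a, atom_b, bond_order) entries."""
    atoms: List[str]
    bonds: List[Tuple[int, int, int]]


@dataclass
class RepeatUnit:
    """A structural repeat unit (SRU): a connection table with two open ends.

    head and tail are indices of the atoms that bond to the neighboring
    units in the chain (hypothetical attribute names).
    """
    table: ConnectionTable
    head: int
    tail: int

    def oligomer(self, n: int) -> ConnectionTable:
        """Expand the SRU into an explicit chain of n units (end groups omitted)."""
        atoms: List[str] = []
        bonds: List[Tuple[int, int, int]] = []
        size = len(self.table.atoms)
        for k in range(n):
            offset = k * size
            atoms.extend(self.table.atoms)
            # Copy the bonds inside the unit, shifted to the new atom indices.
            bonds.extend((a + offset, b + offset, order)
                         for a, b, order in self.table.bonds)
            if k > 0:
                # Single bond from the previous unit's tail to this unit's head.
                bonds.append((offset - size + self.tail, offset + self.head, 1))
        return ConnectionTable(atoms, bonds)


# Ethanol: a confined structure, fully described by one connection table.
ethanol = ConnectionTable(atoms=["C", "C", "O"], bonds=[(0, 1, 1), (1, 2, 1)])

# Polyethylene: described by its repeat unit -CH2-CH2-, not by a fixed atom list.
polyethylene = RepeatUnit(ConnectionTable(["C", "C"], [(0, 1, 1)]), head=0, tail=1)
trimer = polyethylene.oligomer(3)
```

The small molecule is complete as written, whereas the polymer only becomes an explicit graph once a chain length is chosen. Real polymer samples have a distribution of chain lengths, which is one reason a repeat-unit abstraction is used instead of an expanded connection table.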

The unambiguous description, storage, search and modeling of polymer systems depends on the adoption of recommended, agreed-upon nomenclatures and structural representation systems. An IUPAC recommendation for organic polymers exists and provides a structure-based nomenclature for regular single-strand polymers [1]. The chemical Sgroup approach serves as a polymer abstraction concept [2]. The Polymer Markup Language (PML) utilizes XML technology to manage polymer information [3]. The user-friendly CurlySMILES language supports structural encoding of macromolecules as annotated SMILES notation [4,5]. CurlySMILES is currently being extended to encode multi-strand polymers and copolymers. Further, CurlySMILES provides a syntax to represent complex systems such as polymer assemblies, polymer solutions, doped polymers and nanocomposites in a compact single-line notation.

A recent thesis on automatic polymer data evaluation in combination with the Polymer Informatics Knowledge System (PIKS) constitutes an excellent source to familiarize oneself with solutions and challenges in computer-assisted polymer research [6].

The present Polymer Informatics blog is intended as a platform to discuss diverse aspects of integrating polymer science with data management technologies and computational disciplines.

[1] J. Kahovec, R. B. Fox and K. Hatada: Nomenclature of regular single-strand organic polymers. Pure Appl. Chem. 2002, 74 (10), pp. 1921-1956.
PDF: pac.iupac.org/publications/pac/pdf/2002/pdf/7410x1921.pdf.
[2] A. J. Gushurst, J. G. Nourse, W. D. Hounshell, B. A. Leland and D. G. Raich: The substance module: the representation, storage, and searching of complex structures. J. Chem. Inf. Comput. Sci. 1991, 31 (4), pp. 447-454. DOI: 10.1021/ci00004a003.
[3] N. Adams, J. Winter, P. Murray-Rust and H. S. Rzepa: Chemical Markup, XML and the World-Wide Web. 8. Polymer Markup Language. J. Chem. Inf. Model. 2008, 48, pp. 2118-2128. DOI: 10.1021/ci8002123.
[4] A. Drefahl: CurlySMILES: a chemical language to customize and annotate encodings of molecular and nanodevice structures. J. Cheminform. 2011, 3:1. DOI: 10.1186/1758-2946-3-1.
[5] A. Drefahl: CurlySMILES encoding of homopolymers.
Internet: www.axeleratio.com/csm/encoding/polymers/homopolymers.htm.
[6] N. W. England: Automatic analysis and validation of open polymer data. Dissertation submitted for the degree of Doctor of Philosophy. University of Cambridge, United Kingdom, 2011. Internet: https://www.repository.cam.ac.uk/handle/1810/237228.