How to convert a list of CAS numbers and IUPAC names to SMILES at once [Free software]

2018/12/9

Many sites return a SMILES string when you enter the CAS number (CAS No.) or IUPAC name of a single compound, but entering a long list one compound at a time is impractical. In this article, I will show you how to batch convert a list of thousands of compound notations to SMILES or InChIKey.

What is ChemCell?

ChemCell is a macro that lets Microsoft Excel convert chemical names and CAS numbers to SMILES strings. The convertible compound notations are as follows.

  • CAS No.
  • SMILES
  • InChIKey
  • IUPAC name

Download ChemCell

Go to the ChemCell GitHub page from the "downloading ChemCell" link, then choose "Clone or Download" → "Download ZIP" to get the files.

 

How to use

Unzip the zip file and open chemcell.xls. If macros are not enabled, enable them under Trust Center > Trust Center Settings > Macro Settings. After that, all you have to do is specify the cell that contains the compound notation to be converted and enter one of the following functions.

=getSMILES()

Outputs SMILES from an IUPAC name or CAS No.

=getInChIKey()

Outputs InChIKey from an IUPAC name, CAS No., or SMILES.

Trying a conversion

Let's try it with benzene. PubChem lists the following notations for benzene.

  • IUPAC Name: Benzene
  • CAS: 27271-55-2
  • SMILES: c1ccccc1
  • InChIKey: UHOVQNZJYSORNB-UHFFFAOYSA-N

chemcell.xls returned matching values, so the conversion is correct.

By the way, converting a list of thousands of compounds took several tens of minutes of processing time.
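For readers who script the same kind of batch conversion outside Excel, the Python sketch below shows one way to keep the number of lookups down: each distinct identifier is looked up only once, and a failed lookup does not stop the rest of the list. This is not how ChemCell itself is implemented. The names convert_batch, lookup, and demo_lookup are hypothetical, and the demo table is a stand-in built from the benzene values in this article so that the example runs offline.

```python
from typing import Callable, Dict, Iterable, List, Optional


def convert_batch(
    identifiers: Iterable[str],
    lookup: Callable[[str], str],
) -> List[Optional[str]]:
    """Convert identifiers in order, calling `lookup` once per distinct value."""
    cache: Dict[str, Optional[str]] = {}
    results: List[Optional[str]] = []
    for raw in identifiers:
        key = raw.strip()
        if key not in cache:
            try:
                # Empty cells are skipped without calling the lookup at all.
                cache[key] = lookup(key) if key else None
            except Exception:
                # A failed lookup (unknown name, network error) becomes None
                # so that one bad row does not abort the whole list.
                cache[key] = None
        results.append(cache[key])
    return results


if __name__ == "__main__":
    # Stand-in lookup (assumption): a real run would query a conversion service.
    demo_table = {"Benzene": "c1ccccc1", "27271-55-2": "c1ccccc1"}

    def demo_lookup(identifier: str) -> str:
        return demo_table[identifier]

    rows = ["Benzene", "27271-55-2", "Benzene", "not-a-compound", ""]
    print(convert_batch(rows, demo_lookup))
```

With thousands of rows, repeated compounds are common, so caching per distinct identifier can noticeably reduce the waiting time when each lookup is a network request.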

How it works

https://cactus.nci.nih.gov/chemical/structure/"compound structure identifier"/"desired output representation"

The mechanism of ChemCell is simple: it uses "Chemical Identifier Resolver", the online compound structure notation conversion service of the National Cancer Institute. In Chemical Identifier Resolver, if you put an IUPAC name, CAS No., etc. in a certain part of the URL, the corresponding structure notation is returned. Since the conversion goes through a service operated by an established national research institute, the output of ChemCell seems reasonably reliable.
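The URL pattern can be sketched in Python as follows. This is a minimal illustration of the idea, not ChemCell's own macro code: the function names build_resolver_url and resolve are hypothetical, and the representation names "smiles" and "stdinchikey" are assumed to be the ones you want from the resolver. The exact requests ChemCell sends may differ.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://cactus.nci.nih.gov/chemical/structure"


def build_resolver_url(identifier: str, representation: str) -> str:
    """Return the resolver URL for one identifier and one output representation."""
    # Percent-encode the identifier so that names containing spaces, '#' or '/'
    # stay inside a single path segment of the URL.
    return f"{BASE_URL}/{quote(identifier.strip(), safe='')}/{representation}"


def resolve(identifier: str, representation: str = "smiles", timeout: float = 30.0) -> str:
    """Fetch the requested representation as plain text (needs network access)."""
    url = build_resolver_url(identifier, representation)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Only the URLs are printed here, so the example runs without a network.
    print(build_resolver_url("Benzene", "smiles"))
    print(build_resolver_url("27271-55-2", "stdinchikey"))
```

Calling resolve("Benzene") would send the first URL to the service and return whatever text it responds with. Unknown identifiers typically raise an HTTP error, so a real script should catch that case.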