EUtils
EUtils is a client-side library for the Entrez databases at NCBI.
NCBI provides the EUtils web service so that software can query Entrez
directly, rather than going through the web interface and dealing with
the hassles of web scraping.
This package provides two levels of interface. The lower level is a
programmatic interface for constructing the query URL and making the
request. The higher-level interfaces support history tracking and
parsing of query results. These greatly simplify working with the
EUtils server.
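To give a rough idea of what the lower level automates, here is a minimal sketch of building a fetch URL by hand. It does not use this package. The helper name build_efetch_url is hypothetical, the base URL is assumed to be NCBI's public E-utilities endpoint, and the sketch is written in Python 3 syntax and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: NCBI's public E-utilities base URL (not taken from this package).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_efetch_url(db, ids, rettype="fasta", retmode="text"):
    """Hypothetical helper: build an efetch URL for a list of identifiers."""
    params = [
        ("db", db),
        ("id", ",".join(str(i) for i in ids)),  # identifiers are comma-separated
        ("rettype", rettype),
        ("retmode", retmode),
    ]
    # safe="," keeps the comma between identifiers unescaped
    return EUTILS_BASE + "efetch.fcgi?" + urlencode(params, safe=",")


url = build_efetch_url("protein", ["4579714"])
print(url)
```

The higher-level interfaces hide this kind of URL bookkeeping, along with tracking of earlier results on the server.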
Download: EUtils-1.0p1.tar.gz
[CHANGELOG]
EUtils is distributed under the Biopython License.
To purchase commercial support or to hire us to develop
customized tools built using EUtils, contact
info@dalkescientific.com.
Example: Get all protein sequences related to protein GI:4579714:
>>> import EUtils
>>> from EUtils import HistoryClient
>>> client = HistoryClient.HistoryClient()
>>> result = client.post(EUtils.DBIds("protein", "4579714"))
>>> related = result.neighbor_links("protein")
>>> related_dbids = related.linksetdbs["protein_protein"].dbids
>>> proteins = client.post(related_dbids)
>>> len(proteins)
223
>>> infile = proteins.efetch(retmode = "text", rettype = "fasta")
>>>
>>> fasta = infile.read()
>>> print fasta[:788]
>gi|27450749|gb|AAO14677.1|AF508258_1 rhodopsin [Pyrocystis lunula]
MAPIPDGFTYGQWSLVYNSLSFGIAGMGCATIFFWLQLPNVSKSYRTALTITGLVTAIATYHYVRIFNSW
VDAFKVVNVNGGDYTVTLLGAPFNDAYRYVDWLLTVPLLLIELILVMKLPKAETVKLSWNLGVASAVMVA
LGYPGEIQDDLLVRWFWWAMAMIPFYYVVVTLVNGLSDATAKQPDSVKSLVVTARYLTVISWLTYPGVYI
IKSMGLAGNIATTYEQVGYSVADVVAKAVFGVLIWAIAAGKSDEEEKNGLLG
>gi|6319528|ref|NP_009610.1| Homolog to HSP30 heat shock protein Yro1p; Yro2p [Saccharomyces cerevisiae]
MSDYVELLKRGGNEAIKINPPTGADFHITSRGSDWLFTVFCVNLLFGVILVPLMFRKPVKDRFVYYTAIA
PNLFMSIAYFTMASNLGWIPVRAKYNHVQTSTQKEHPGYRQIFYARYVGWFLAFPWPIIQMSLLGGTPLW
QIAFNVGMTEIFTVCWLIAACVHSTYKWGYYTIGIGAAIVVCISLMTTTFNLVKARGKDVSNVFITFMSV
IMFLWLIAYPTCFGITDGGNVLQPDSATIFYGIIDLLILSILPVLFMPLANYLGIERLGLIFDEEPAEHV
GPVAEKKMPSPASFKSSDSDSSIKEKLKLKKKHKKDKKKAKKAKKAKKAKKAQEEEEDVATDSE
>>>
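Once the FASTA text has been read, it usually needs to be split into individual records. The following is a small sketch of one way to do that. The helper split_fasta is hypothetical and not part of EUtils, the sample string is made-up illustration data rather than server output, and the code uses Python 3 syntax.

```python
def split_fasta(text):
    """Hypothetical helper: return a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without newlines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up sample data for illustration only.
sample = ">gi|1|example one\nMAPIP\nDGFTY\n>gi|2|example two\nMSDYV\n"
records = split_fasta(sample)
for header, sequence in records:
    print(header, len(sequence))
```

With the session above, the same helper could be applied to the string held in fasta.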