The Bio::PubMed class provides several ways to retrieve bibliographic information from the PubMed database at
http://www.ncbi.nlm.nih.gov/sites/entrez?db=PubMed
Two types of queries are possible:
searching for PubMed IDs given a query string:
  Bio::PubMed#esearch (recommended)
  Bio::PubMed#search (only retrieves the top 20 hits)
retrieving the MEDLINE text (i.e. authors, journal, abstract, …) given a PubMed ID:
  Bio::PubMed#efetch (recommended)
  Bio::PubMed#query (unstable: depends on the HTML layout of the NCBI site)
  Bio::PubMed#pmfetch (still works but may be retired by NCBI)
The different methods within the same group are interchangeable and should return the same result.
Additional information about the MEDLINE format and PubMed programmable APIs can be found on the following websites:
PubMed Overview:
http://www.ncbi.nlm.nih.gov/entrez/query/static/overview.html
PubMed help:
http://www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html
Entrez utilities index:
http://www.ncbi.nlm.nih.gov/entrez/utils/utils_index.html
How to link:
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helplinks.chapter.linkshelp
  require 'bio'

  # If you don't know the PubMed ID:
  Bio::PubMed.esearch("(genome AND analysis) OR bioinformatics").each do |x|
    p x
  end
  Bio::PubMed.search("(genome AND analysis) OR bioinformatics").each do |x|
    p x
  end

  # To retrieve the MEDLINE entry for a given PubMed ID
  # (efetch takes multiple IDs as an array; its second argument is an options hash):
  puts Bio::PubMed.efetch(["10592173", "14693808"])
  puts Bio::PubMed.query("10592173")
  puts Bio::PubMed.pmfetch("10592173")

  # This can be converted into a Bio::MEDLINE object:
  manuscript = Bio::PubMed.query("10592173")
  medline = Bio::MEDLINE.new(manuscript)
  # File lib/bio/io/pubmed.rb, line 204
  def self.efetch(*args)
    self.new.efetch(*args)
  end

  # File lib/bio/io/pubmed.rb, line 200
  def self.esearch(*args)
    self.new.esearch(*args)
  end

  # File lib/bio/io/pubmed.rb, line 216
  def self.pmfetch(*args)
    self.new.pmfetch(*args)
  end
Retrieves PubMed entries by PMID using Entrez efetch and returns them in MEDLINE format. Multiple PubMed IDs can be provided as an array:
  Bio::PubMed.efetch(123)
  Bio::PubMed.efetch([123,456,789])
Arguments:
  ids: list of PubMed IDs (required)
  hash: hash of E-Utils options
    retmode: "xml", "html", ...
    rettype: "medline", ...
    retmax: integer (default 100)
    retstart: integer
    field
    reldate
    mindate
    maxdate
    datetype
Returns: array of MEDLINE-formatted strings when retmode is unset or "text"; otherwise the unsplit result
  # File lib/bio/io/pubmed.rb, line 117
  def efetch(ids, hash = {})
    opts = { "db" => "pubmed", "rettype" => "medline" }
    opts.update(hash)
    result = super(ids, opts)
    if !opts["retmode"] or opts["retmode"] == "text"
      result = result.split(/\n\n+/)
    end
    result
  end
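The option merge and the record split in efetch can be tried offline. The following is only a sketch: `efetch_options` and `split_medline_records` are hypothetical helpers, not part of Bio::PubMed, and the sample text is made up. They mirror the two steps visible in the source above without contacting NCBI.

```ruby
# Hypothetical helper: merges caller options over the defaults,
# the same way efetch does with opts.update(hash).
def efetch_options(hash = {})
  opts = { "db" => "pubmed", "rettype" => "medline" }
  opts.update(hash)
  opts
end

# Hypothetical helper: splits a text response into one string per record
# when retmode is unset or "text", and leaves other formats untouched.
def split_medline_records(text, opts)
  if !opts["retmode"] || opts["retmode"] == "text"
    text.split(/\n\n+/)
  else
    text
  end
end

# Made-up MEDLINE-like text with two records separated by a blank line.
sample = "PMID- 10592173\nTI  - First title\n\nPMID- 14693808\nTI  - Second title\n"

records = split_medline_records(sample, efetch_options)
puts records.length
```

Because caller options are merged after the defaults, passing "retmode" => "xml" in this sketch skips the split and the text comes back as a single string.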
Searches the PubMed database for the given keywords using E-Utils and returns an array of PubMed IDs.
For information on the possible arguments, see eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html#PubMed
Arguments:
  str: query string (required)
  hash: hash of E-Utils options
    retmode: "xml", "html", ...
    rettype: "medline", ...
    retmax: integer (default 100)
    retstart: integer
    field
    reldate
    mindate
    maxdate
    datetype
Returns: array of PubMed IDs or a number of results
  # File lib/bio/io/pubmed.rb, line 93
  def esearch(str, hash = {})
    opts = { "db" => "pubmed" }
    opts.update(hash)
    super(str, opts)
  end
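Since esearch takes its E-Utils options as a plain hash, paging through a large result set comes down to computing retstart and retmax. The sketch below assumes that retstart is the zero-based offset of the first hit and retmax the number of hits to return; `esearch_page_options` and `merged_esearch_options` are hypothetical helpers, not BioRuby methods.

```ruby
# Hypothetical helper: options for the given 1-based page of results.
def esearch_page_options(page, per_page = 100)
  { "retmax" => per_page, "retstart" => (page - 1) * per_page }
end

# Hypothetical helper: mirrors the merge done inside esearch, where
# caller options are applied on top of the default database.
def merged_esearch_options(hash = {})
  opts = { "db" => "pubmed" }
  opts.update(hash)
  opts
end

# Options for the third page at 50 hits per page.
p merged_esearch_options(esearch_page_options(3, 50))

# With the bio gem installed, the hash would be passed as the second argument:
#   Bio::PubMed.esearch("bioinformatics", esearch_page_options(3, 50))
```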
Retrieves a PubMed entry by PMID using Entrez pmfetch and returns it as a MEDLINE-formatted string.
Arguments:
  id: PubMed ID (required)
Returns: MEDLINE-formatted string
  # File lib/bio/io/pubmed.rb, line 183
  def pmfetch(id)
    host = "www.ncbi.nlm.nih.gov"
    path = "/entrez/utils/pmfetch.fcgi?tool=bioruby&mode=text&report=medline&db=PubMed&id="

    ncbi_access_wait

    http = Bio::Command.new_http(host)
    response = http.get(path + CGI.escape(id.to_s))
    result = response.body
    if result =~ /#{id}\s+Error/
      raise( result )
    else
      result = result.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
      return result
    end
  end
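The cleanup chain at the end of pmfetch is easy to check in isolation. In this sketch, `clean_pmfetch_body` is a hypothetical helper that repeats the same three calls, and the input string is a made-up stand-in for a server response.

```ruby
# Hypothetical helper repeating the cleanup chain from pmfetch:
# 1. turn carriage returns into newlines,
# 2. collapse runs of newlines into a single newline,
# 3. strip <pre> and </pre> tags.
def clean_pmfetch_body(body)
  body.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
end

# Made-up response body with CRLF line endings wrapped in <pre> tags.
raw = "<pre>\r\nPMID- 10592173\r\n\r\nTI  - Example title\r\n</pre>"

puts clean_pmfetch_body(raw)
```

Because the tags are removed after the newlines are squeezed, the cleaned string in this example keeps one leading and one trailing newline.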
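Retrieves PubMed entries by PMID using an Entrez query and returns them in MEDLINE format.
Arguments:
  ids: one or more PubMed IDs (required)
Returns: a MEDLINE-formatted string for a single ID, or an array of such strings for multiple IDs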
  # File lib/bio/io/pubmed.rb, line 153
  def query(*ids)
    host = "www.ncbi.nlm.nih.gov"
    path = "/sites/entrez?tool=bioruby&cmd=Text&dopt=MEDLINE&db=PubMed&uid="
    list = ids.collect { |x| CGI.escape(x.to_s) }.join(",")

    ncbi_access_wait

    http = Bio::Command.new_http(host)
    response = http.get(path + list)
    result = response.body
    result = result.scan(/<pre>\s*(.*?)<\/pre>/).flatten

    if result =~ /id:.*Error occurred/
      # id: xxxxx Error occurred: Article does not exist
      raise( result )
    else
      if ids.size > 1
        return result
      else
        return result.first
      end
    end
  end
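Two details of query are worth seeing on their own: how the IDs are joined into the uid parameter, and how the return value depends on the number of IDs. The helpers below, `query_uid_list` and `query_result_shape`, are hypothetical and only imitate those two steps; no request is sent.

```ruby
require 'cgi'

# Hypothetical helper: builds the comma-separated, URL-escaped ID list
# that query appends to its request path.
def query_uid_list(*ids)
  ids.collect { |x| CGI.escape(x.to_s) }.join(",")
end

# Hypothetical helper: mirrors the return shape of query, which is the
# whole array for several IDs and the first element for a single ID.
def query_result_shape(records, ids)
  ids.size > 1 ? records : records.first
end

puts query_uid_list(10592173, "14693808")
p query_result_shape(["record A"], [10592173])
p query_result_shape(["record A", "record B"], [10592173, 14693808])
```

Callers that always want an array could wrap the return value in Array(), at the cost of hiding the difference between the two cases.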
Searches the PubMed database for the given keywords using an Entrez query and returns an array of PubMed IDs. Caution: this method returns only the first 20 hits; using the esearch method instead is strongly recommended.
Arguments:
  str: query string (required)
Returns: array of PubMed IDs
  # File lib/bio/io/pubmed.rb, line 134
  def search(str)
    host = "www.ncbi.nlm.nih.gov"
    path = "/sites/entrez?tool=bioruby&cmd=Search&doptcmdl=Brief&db=PubMed&term="

    ncbi_access_wait

    http = Bio::Command.new_http(host)
    response = http.get(path + CGI.escape(str))
    result = response.body
    result = result.scan(/value="(\d+)" id="UidCheckBox"/).flatten
    return result
  end
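The ID extraction in search relies on a regular expression over the returned HTML, which is why the method is sensitive to page layout. The sketch below applies the same pattern to a made-up HTML fragment; `extract_uids` is a hypothetical helper, and the markup is only an assumption about what such a page might contain.

```ruby
# Hypothetical helper applying the same pattern that search uses
# to pull PubMed IDs out of the result page.
def extract_uids(html)
  html.scan(/value="(\d+)" id="UidCheckBox"/).flatten
end

# Made-up HTML fragment. The third input has its attributes in a
# different order, so the pattern does not match it.
sample_html = <<~HTML
  <input type="checkbox" value="10592173" id="UidCheckBox">
  <input type="checkbox" value="14693808" id="UidCheckBox">
  <input type="checkbox" id="UidCheckBox" value="99999999">
HTML

p extract_uids(sample_html)
```

The unmatched third line shows how a small change in the markup can silently drop hits.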