Fetch article abstracts and build a TF-IDF matrix

<Environment>

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=Japanese_Japan.932  LC_CTYPE=Japanese_Japan.932   
[3] LC_MONETARY=Japanese_Japan.932 LC_NUMERIC=C                  
[5] LC_TIME=Japanese_Japan.932    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] RevoUtils_10.0.6     RevoUtilsMath_10.0.1

loaded via a namespace (and not attached):
[1] compiler_3.4.2 tools_3.4.
> packageVersion("text2vec")
[1] ‘0.5.0’
> packageVersion("xml2")
[1] ‘1.1.1’
> packageVersion("dplyr")
[1] ‘0.7.4’
> packageVersion("tm")
[1] ‘0.7.1’

【0】Load the required libraries

library(text2vec)
library(xml2)
library(dplyr)
library(tm)

【1】Fetch the data (article abstracts)

address <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&mode=XML&id="

pubmedID <- c("25458663", "26200944", "27118687")

getAbst <- function(pmid) {
    # Fetch the PubMed record as XML from the efetch endpoint
    myXML <- read_xml(paste0(address, pmid))
    # Structured abstracts contain several <AbstractText> nodes,
    # so collapse them into a single string
    myXML %>%
        xml_find_all(xpath = "//AbstractText") %>%
        xml_text() %>%
        paste(collapse = " ")
}

Absts <- sapply(pubmedID, getAbst)  # one abstract per PubMed ID
Absts <- stemDocument(Absts, "en")  # Porter-stem each word (tm)
Absts <- removeNumbers(Absts)       # strip digits
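As a quick sanity check of what these two tm helpers actually do, here is a minimal sketch on a made-up sentence (not the PubMed data):

```r
library(tm)

s <- "smoking causes 3 diseases in patients"
stemDocument(s, "en")  # Porter-stems each word: "smoke caus 3 diseas in patient"
removeNumbers(s)       # strips digits only; whitespace is left as-is
```

Note that stemDocument() leaves the digit "3" untouched, which is why removeNumbers() is applied as a separate step.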

【2】Extract the words

it <- itoken(Absts, tolower, word_tokenizer)               # lowercase, then tokenize
voc <- create_vocabulary(it, stopwords = stopwords("en"))  # drop tm's English stopword list
> head(voc[order(voc$term_count, decreasing = T),], n = 20)
Number of docs: 3 
174 stopwords: i, me, my, myself, we, our ... 
ngram_min = 1; ngram_max = 1 
Vocabulary: 
              term term_count doc_count
 1:             ci         22         2
 2:            ckd         17         3
 3:        current         13         3
 4:             rr         12         1
 5:              m         12         3
 6:           risk         11         3
 7:         smoker         11         3
 8:             kg         10         1
 9:           caus          9         3
10: cardiovascular          8         2
11:         diseas          8         3
12:         associ          8         2
13:         events          8         2
14:           rate          7         3
15:          smoke          7         2
16:          never          7         2
17:             hr          6         1
18:            bmi          6         1
19:             vs          6         1
20:       lifestyl          6         1
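With only three documents the raw vocabulary is noisy (many singleton terms such as "hr" and "bmi"). A standard next step in text2vec is prune_vocabulary(); here is a sketch on a tiny stand-in corpus, with purely illustrative thresholds:

```r
library(text2vec)

# Stand-in corpus; the real input is the stemmed abstracts above
docs <- c("smoker risk ckd", "ckd risk rate", "ckd current smoker")
it2  <- itoken(docs, tolower, word_tokenizer)
voc2 <- create_vocabulary(it2)

# Keep only terms seen at least twice overall and in at least 2 documents
voc2 <- prune_vocabulary(voc2, term_count_min = 2, doc_count_min = 2)
voc2$term  # "ckd", "risk", "smoker" survive; "rate" and "current" are dropped
```

The pruned vocabulary can be passed to vocab_vectorizer() exactly like the unpruned one.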

【3】Build the DTM (Document-Term Matrix)

dtm <- create_dtm(it, vocab_vectorizer(voc))
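The resulting dtm is a sparse matrix with one row per abstract and one column per vocabulary term, and it can be inspected like any matrix. A minimal sketch on synthetic documents:

```r
library(text2vec)

docs <- c("smoker risk ckd", "ckd risk rate")
it2  <- itoken(docs, tolower, word_tokenizer)
dtm2 <- create_dtm(it2, vocab_vectorizer(create_vocabulary(it2)))

dim(dtm2)        # 2 documents x 4 distinct terms
as.matrix(dtm2)  # raw per-document term counts
```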

【4】Compute TF-IDF (Term Frequency-Inverse Document Frequency)

model_tfidf <- TfIdf$new()
dtm_tfidf <- model_tfidf$fit_transform(dtm)
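One way to check whether the weighting "worked" is to verify that a term unique to one document carries the largest weight in that document's row (with text2vec's defaults, smoothed IDF and L1 row normalization, this must hold). A sketch on synthetic documents:

```r
library(text2vec)

docs <- c("smoker risk ckd", "ckd risk rate")
it2  <- itoken(docs, tolower, word_tokenizer)
dtm2 <- create_dtm(it2, vocab_vectorizer(create_vocabulary(it2)))

tfidf2 <- TfIdf$new()
m <- as.matrix(tfidf2$fit_transform(dtm2))

# "smoker" occurs only in document 1 and "rate" only in document 2,
# so each gets the largest TF-IDF weight in its own row, while the
# shared terms "ckd" and "risk" are down-weighted
stopifnot(m[1, "smoker"] == max(m[1, ]))
stopifnot(m[2, "rate"]   == max(m[2, ]))
```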


It seems to be working... or maybe it isn't.
Now then, where should I take it from here?