
docs object expects all word frequencies to be 1 - transformation from dfm object (quanteda) #10

@JonasRieger

The docs object expects (for technical reasons) that every word entry has frequency 1. A word that occurs several times in a document must therefore appear as several entries, each with frequency 1.
The dfm objects of the quanteda package, by contrast, allow counts greater than 1. If you do your preprocessing in quanteda and use quanteda::dfm2lda to convert your object into the required structure, one more step is needed to meet the requirements of the docs object. Just execute the following line:

docs = lapply(docs, function(x) rbind(rep(x[1,], x[2,]), 1))

This replicates each word according to its count, so that every entry has frequency 1, and prevents the error message "all(sapply(docs, function(x) all(x[2, ] == 1))) is not TRUE" in LDARep and similar functions.
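
To make the effect of that line concrete, here is a minimal sketch on a single toy document. It assumes the two-row layout implied above (row 1 holds the word IDs, row 2 the counts). The IDs, the counts, and the helper name expand_counts are made up for illustration and do not come from either package.

```r
# Toy document: word 0 occurs twice, word 3 once, word 7 three times.
# Row 1 = word IDs, row 2 = counts (illustrative values only).
doc <- rbind(c(0L, 3L, 7L),
             c(2L, 1L, 3L))
docs <- list(doc)

# Same transformation as the one-liner above, wrapped in a
# hypothetical helper so the step can be inspected on its own.
expand_counts <- function(x) rbind(rep(x[1, ], x[2, ]), 1)

docs_expanded <- lapply(docs, expand_counts)

docs_expanded[[1]][1, ]  # word IDs after expansion: 0 0 3 7 7 7
docs_expanded[[1]][2, ]  # frequencies: 1 1 1 1 1 1

# The number of tokens is unchanged: the sum of the original counts
# equals the number of columns after expansion.
stopifnot(sum(doc[2, ]) == ncol(docs_expanded[[1]]))

# The condition quoted in the error message now holds.
stopifnot(all(sapply(docs_expanded, function(x) all(x[2, ] == 1))))
```

The second row is filled by recycling the single value 1 across all columns, which is why one rbind call is enough.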


Labels: usability (Enhancement of user friendliness)
