| freqItems {SparkR} | R Documentation |
Finds frequent items for columns, possibly with false positives, using the frequent element count algorithm described in http://dx.doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou.
## S4 method for signature 'SparkDataFrame,character'
freqItems(x, cols, support = 0.01)
| x | A SparkDataFrame. |
| cols | A vector of column names to search frequent items in. |
| support | (Optional) The minimum frequency for an item to be considered frequent. Should be greater than 1e-4. Defaults to 0.01. |
A local R data.frame with the frequent items in each column.
freqItems since 1.6.0
Other stat functions: approxQuantile,
approxQuantile,SparkDataFrame,character,numeric,numeric-method;
corr, corr,Column-method,
corr,SparkDataFrame-method;
cov, cov,SparkDataFrame-method,
cov,characterOrColumn-method,
covar_samp,
covar_samp,characterOrColumn,characterOrColumn-method;
crosstab,
crosstab,SparkDataFrame,character,character-method;
sampleBy,
sampleBy,SparkDataFrame,character,list,numeric-method
## Not run:
df <- read.json("/path/to/file.json")
# Frequent items in the "title" and "gender" columns, using the default support of 0.01
fi <- freqItems(df, c("title", "gender"))
## End(Not run)
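
To illustrate why the result may contain false positives, here is a minimal sketch in plain R of a counter-based frequent element algorithm in the spirit of the one cited in the description. It is not SparkR's implementation: the function name `frequent_candidates` and its arguments are hypothetical, and it works on an ordinary R vector rather than a SparkDataFrame. It assumes that keeping `ceiling(1 / support)` counters is an acceptable bound; with that many counters, any item occurring in more than a `support` fraction of the values is retained, but a retained item is not guaranteed to be that frequent.

```r
# Hypothetical helper, not part of SparkR: returns candidate frequent items
# of a plain R vector using a bounded number of counters.
# Assumes values are non-empty strings (or convertible to them).
frequent_candidates <- function(values, support = 0.01) {
  stopifnot(support > 0, support <= 1)
  values <- values[!is.na(values)]       # missing values are ignored
  max_counters <- ceiling(1 / support)   # assumed bound on tracked items
  counts <- integer(0)                   # named integer vector: item -> counter

  for (v in as.character(values)) {
    if (v %in% names(counts)) {
      # Item is already tracked: increment its counter.
      counts[v] <- counts[v] + 1L
    } else if (length(counts) < max_counters) {
      # A counter is free: start tracking the item.
      counts[v] <- 1L
    } else {
      # No free counter: decrement every counter and drop those reaching zero.
      counts <- counts - 1L
      counts <- counts[counts > 0L]
    }
  }
  names(counts)
}

# "a" makes up half of the values and is the only survivor.
frequent_candidates(c("a", "b", "a", "c", "a", "d", "a", "b"), support = 0.5)
# returns "a"

# "b" occurs in only a quarter of the values but still occupies a counter
# at the end of the pass, so it is reported: a false positive.
frequent_candidates(c("a", "a", "a", "b"), support = 0.5)
# returns c("a", "b")
```

A second pass that counts the surviving candidates exactly would remove such false positives; the sketch above stops after a single pass, so its output should be read as a set of candidates.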