| sampleBy {SparkR} | R Documentation |
Returns a stratified sample without replacement based on the fraction given for each stratum.
sampleBy(x, col, fractions, seed)

## S4 method for signature 'SparkDataFrame,character,list,numeric'
sampleBy(x, col, fractions, seed)
x | A SparkDataFrame |
col | column that defines strata |
fractions | A named list giving sampling fraction for each stratum. If a stratum is not specified, we treat its fraction as zero. |
seed | random seed |
A new SparkDataFrame that represents the stratified sample
sampleBy since 1.6.0
Other stat functions: approxQuantile,
approxQuantile,SparkDataFrame,character,numeric,numeric-method;
corr, corr,Column-method,
corr,SparkDataFrame-method;
cov, cov,SparkDataFrame-method,
cov,characterOrColumn-method,
covar_samp,
covar_samp,characterOrColumn,characterOrColumn-method;
crosstab,
crosstab,SparkDataFrame,character,character-method;
freqItems,
freqItems,SparkDataFrame,character-method
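## Not run: 
df <- read.json("/path/to/file.json")
# Named list of sampling fractions, one entry per stratum of "key".
# The stratum names "0" and "1" are illustrative, not taken from real data.
fractions <- list("0" = 0.1, "1" = 0.2)
# Stored as `sampled` so that base R's sample() is not masked
sampled <- sampleBy(df, "key", fractions, 36)
## End(Not run)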
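
The handling of the fractions argument described above can be illustrated without a Spark session. The following is a minimal plain-R sketch, not the SparkR implementation: the helper name sample_by_local and the toy data frame are hypothetical, and it assumes that each row is kept independently with probability equal to its stratum's fraction, so the sample size per stratum is approximate rather than exact.

```r
# Plain-R illustration (no Spark required) of stratified sampling without
# replacement. Assumption: each row is kept independently with probability
# equal to its stratum's fraction; strata missing from `fractions` get 0.
# `sample_by_local` is a hypothetical helper, not part of SparkR.
sample_by_local <- function(df, col, fractions, seed) {
  set.seed(seed)
  strata <- as.character(df[[col]])
  # Look up each row's fraction; unspecified strata default to zero
  p <- vapply(strata, function(s) {
    f <- fractions[[s]]
    if (is.null(f)) 0 else f
  }, numeric(1), USE.NAMES = FALSE)
  # Keep a row when its uniform draw falls below its stratum's fraction
  df[runif(nrow(df)) < p, , drop = FALSE]
}

# Toy data: three strata of 100 rows each
df <- data.frame(key = rep(c("a", "b", "c"), each = 100),
                 value = seq_len(300))
fractions <- list(a = 0.5, b = 1.0)  # stratum "c" is deliberately omitted
sampled <- sample_by_local(df, "key", fractions, 36)
table(sampled$key)
```

In this sketch every row of stratum "b" is kept, no row of stratum "c" appears because its fraction defaults to zero, and roughly half of stratum "a" is retained. No row can be selected twice, which is what "without replacement" means here.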