Filters rows and/or selects columns of a DTSg object.
# S3 method for class 'DTSg'
subset(
  x,
  i,
  cols = self$cols(),
  funby = NULL,
  ignoreDST = FALSE,
  na.status = "implicit",
  clone = getOption("DTSgClone"),
  multiplier = 1L,
  funbyHelpers = NULL,
  funbyApproach = self$funbyApproach,
  ...
)
x: A DTSg object (S3 method only).
i: An integerish vector indexing rows (positive numbers pick and
negative numbers omit rows) or a filter expression accepted by the i
argument of data.table::data.table. Filter expressions can contain the
special symbol .N.
cols: A character vector specifying the columns to select. Alternatively,
a single character string containing either comma-separated column
names, for example, "x,y,z", or the start and end column separated by a
colon, for example, "x:z". The .dateTime column is always selected and
must not be listed in cols.
funby: One of the temporal aggregation level functions described in
TALFs or a user defined temporal aggregation level function. Can be
used to, for instance, select the last two observations of a certain
temporal level. See corresponding section and examples for further
information.
ignoreDST: A logical specifying if daylight saving time shall be ignored
by funby. See corresponding section for further information.
na.status: A character string. Either "explicit", which makes missing
timestamps explicit according to the recognised periodicity, or
"implicit", which removes timestamps with missing values on all value
columns. See corresponding section for further information.
clone: A logical specifying if the object shall be modified in place or if a deep clone (copy) shall be made beforehand.
multiplier: A positive integerish value “multiplying” the
temporal aggregation level of certain TALFs. See corresponding section
for further information.
funbyHelpers: An optional list with helper data passed on to
funby. See corresponding section for further information.
funbyApproach: A character string specifying the flavour of the applied
temporal aggregation level function. One of "timechange", which utilises
timechange::time_floor, "base", which utilises as.POSIXct,
"fasttime", which utilises fasttime::fastPOSIXct, or "RcppCCTZ",
which utilises RcppCCTZ::parseDatetime as the main function for
transforming timestamps.
...: Further arguments passed on to fun.
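The two string shorthands accepted by cols can be pictured with a small base R sketch. The helper expandCols and the column names in available are hypothetical and only illustrate the described behaviour; this is not how DTSg itself resolves the argument.

```r
# Hypothetical helper, not part of DTSg: expands the "x,y,z" and "x:z"
# shorthands against a vector of available column names.
expandCols <- function(cols, available) {
  if (length(cols) == 1L && grepl(":", cols, fixed = TRUE)) {
    # "start:end" selects every column from start to end by position
    ends <- match(strsplit(cols, ":", fixed = TRUE)[[1L]], available)
    available[seq(ends[1L], ends[2L])]
  } else if (length(cols) == 1L && grepl(",", cols, fixed = TRUE)) {
    # "a,b,c" is split into the individual column names
    strsplit(cols, ",", fixed = TRUE)[[1L]]
  } else {
    # a plain character vector is used as is
    cols
  }
}

available <- c("w", "x", "y", "z") # assumed value columns
fromCommas <- expandCols("x,y,z", available)
fromRange <- expandCols("x:z", available)
```

Under these assumptions both shorthands name the same three columns, so either form could be passed to cols.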
Returns a DTSg object.
Please note that filtering rows while missing timestamps are explicit, or are
made explicit, is equivalent to setting the values of all other timestamps to
missing. The default value of na.status is therefore "implicit". To simply
filter for a consecutive range of a DTSg object while leaving the na.status
untouched, alter is probably the better choice.
User defined temporal aggregation level functions have to return a
POSIXct vector of the same length as the time series and accept two
arguments: a POSIXct vector as their first and a list with helper data
as their second. The default elements of this list are as follows:
timezone: Same as the timezone field.
ignoreDST: Same as the ignoreDST argument.
periodicity: Same as the periodicity field.
na.status: Same as the na.status field.
multiplier: Same as the multiplier argument.
funbyApproach: Same as the funbyApproach argument.
Any additional element specified in the funbyHelpers argument is appended
to the end of the default list. In case funbyHelpers contains an
ignoreDST, multiplier or funbyApproach element, it takes precedence over
the respective method argument. timezone, periodicity and na.status
elements are rejected, as they are always taken directly from the object.
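The contract described above can be sketched in base R. The function byWeekStart is a hypothetical user defined TALF that floors each timestamp to the Monday of its week, and the helpers list is built by hand here to mimic the documented default elements; DTSg would normally supply it.

```r
# Hypothetical user defined TALF: takes a POSIXct vector and a list with
# helper data, returns a POSIXct vector of the same length.
byWeekStart <- function(.dateTime, .helpers) {
  tz <- .helpers[["timezone"]]
  lt <- as.POSIXlt(.dateTime, tz = tz)
  # POSIXlt wday runs from 0 (Sunday) to 6 (Saturday)
  daysSinceMonday <- (lt$wday + 6L) %% 7L
  monday <- as.Date(format(lt, "%Y-%m-%d")) - daysSinceMonday
  # midnight of that Monday in the time zone of the time series
  as.POSIXct(format(monday), tz = tz)
}

# Hand-built stand-in for the helper list (element names as documented)
helpers <- list(
  timezone = "UTC",
  ignoreDST = FALSE,
  periodicity = as.difftime(1, units = "days"),
  na.status = "implicit",
  multiplier = 1L,
  funbyApproach = "base"
)

timestamps <- as.POSIXct(
  c(
    "2007-01-01 12:00:00", "2007-01-03 00:00:00",
    "2007-01-07 23:00:00", "2007-01-08 00:00:00"
  ),
  tz = "UTC"
)
result <- byWeekStart(timestamps, helpers)
```

Assuming such a function is accepted by funby in the same way as the anonymous function in the examples, x$subset(i = 1L, funby = byWeekStart) would pick the first observation of each week.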
The temporal aggregation level of certain TALFs can be adjusted with the
help of the multiplier argument. A multiplier of 10, for example, makes
byY_____ aggregate to decades instead of years. Another example
is a multiplier of 6 provided to by_m____. The function then aggregates
all months of all first half years together and all months of all second
half years together, instead of aggregating each month across all years
separately. This feature is supported only by certain TALFs of the package.
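The effect of the multiplier can be pictured as flooring a date component to a multiple of the multiplier. The following base R sketch only illustrates this arithmetic on made-up timestamps; it is not the code DTSg uses internally.

```r
# Made-up timestamps for illustration
timestamps <- as.POSIXct(
  c("1994-03-15 00:00:00", "2007-08-01 00:00:00", "2012-12-31 00:00:00"),
  tz = "UTC"
)
lt <- as.POSIXlt(timestamps, tz = "UTC")
years <- lt$year + 1900L # POSIXlt counts years from 1900
months <- lt$mon + 1L    # POSIXlt counts months from 0

# multiplier = 10 on a yearly level: floor each year to its decade
decades <- (years %/% 10L) * 10L

# multiplier = 6 on a month level: floor each month to the first month
# of its half year (1 for January to June, 7 for July to December)
halfYearStart <- ((months - 1L) %/% 6L) * 6L + 1L
```

Here the three years fall into the decades 1990, 2000 and 2010, and the months March, August and December map to the half year starts 1, 7 and 7.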
ignoreDST tells a temporal aggregation level function if it is supposed to
ignore daylight saving time while transforming the timestamps. This can be a
desired feature for time series strictly following the position of the sun
such as hydrological time series. Doing so ensures that diurnal variations
are preserved by all means and all intervals are of the “correct”
length. A possible limitation, however, is that the daylight saving time
shift is invariably assumed to be one hour long. This feature requires that
the periodicity of the time series has been recognised and is supported only
by certain TALFs of the package.
# new DTSg object
x <- DTSg$new(values = flow)
# filter for the first six observations
## R6 method
x$subset(i = 1:6)$print()
#> Values:
#> .dateTime flow
#> <POSc> <num>
#> 1: 2007-01-01 9.540
#> 2: 2007-01-02 9.285
#> 3: 2007-01-03 8.940
#> 4: 2007-01-04 8.745
#> 5: 2007-01-05 8.490
#> 6: 2007-01-06 8.400
#>
#> Aggregated: FALSE
#> Regular: TRUE
#> Periodicity: Time difference of 1 days
#> Missing values: implicit
#> Time zone: UTC
#> Timestamps: 6
## S3 method
print(subset(x = x, i = 1:6))
#> Values:
#> .dateTime flow
#> <POSc> <num>
#> 1: 2007-01-01 9.540
#> 2: 2007-01-02 9.285
#> 3: 2007-01-03 8.940
#> 4: 2007-01-04 8.745
#> 5: 2007-01-05 8.490
#> 6: 2007-01-06 8.400
#>
#> Aggregated: FALSE
#> Regular: TRUE
#> Periodicity: Time difference of 1 days
#> Missing values: implicit
#> Time zone: UTC
#> Timestamps: 6
# filter for the last two observations per year
## R6 method
x$subset(
  i = (.N - 1):.N,
  funby = function(x, ...) {data.table::year(x)}
)$print()
#> Values:
#> .dateTime flow
#> <POSc> <num>
#> 1: 2007-12-30 11.49
#> 2: 2007-12-31 11.61
#> 3: 2008-12-30 12.54
#> 4: 2008-12-31 11.94
#> 5: 2009-12-30 10.11
#> ---
#> 8: 2010-12-31 9.87
#> 9: 2011-12-30 8.04
#> 10: 2011-12-31 7.71
#> 11: 2012-12-30 18.84
#> 12: 2012-12-31 17.25
#>
#> Aggregated: FALSE
#> Regular: TRUE
#> Periodicity: Time difference of 1 days
#> Missing values: implicit
#> Time zone: UTC
#> Timestamps: 12
## S3 method
print(subset(
  x = x,
  i = (.N - 1):.N,
  funby = function(x, ...) {data.table::year(x)}
))
#> Values:
#> .dateTime flow
#> <POSc> <num>
#> 1: 2007-12-30 11.49
#> 2: 2007-12-31 11.61
#> 3: 2008-12-30 12.54
#> 4: 2008-12-31 11.94
#> 5: 2009-12-30 10.11
#> ---
#> 8: 2010-12-31 9.87
#> 9: 2011-12-30 8.04
#> 10: 2011-12-31 7.71
#> 11: 2012-12-30 18.84
#> 12: 2012-12-31 17.25
#>
#> Aggregated: FALSE
#> Regular: TRUE
#> Periodicity: Time difference of 1 days
#> Missing values: implicit
#> Time zone: UTC
#> Timestamps: 12