Package edu.caltech.nanodb.queryeval
Class ColumnStatsCollector
- java.lang.Object
  - edu.caltech.nanodb.queryeval.ColumnStatsCollector
public class ColumnStatsCollector extends java.lang.Object
This class facilitates the collection of statistics for a single column of a table being analyzed by the TableManager.analyzeTable(edu.caltech.nanodb.relations.TableInfo) method. Instances of the class compute the number of distinct values, the number of NULL values, and, for appropriate data types, the minimum and maximum values for the column. The class also makes it easy to construct a ColumnStats object from the result of the analysis.
- Design Note:
- (Donnie) This class cannot efficiently compute the number of unique values for very large tables, because it holds every distinct value in an in-memory set. An external-memory approach would be needed to support extremely large tables.
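The collection pass described above can be sketched as follows. This is a minimal stand-in, not the NanoDB implementation: the names ColumnStatsSketch and ColumnStatsSketchDemo are hypothetical, and the sketch uses a type parameter instead of the documented SQLDataType argument and Object-typed values, so every value is assumed to be comparable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the documented collector; not the NanoDB source. */
class ColumnStatsSketch<T extends Comparable<? super T>> {
    private final Set<T> uniqueValues = new HashSet<>();
    private int numNullValues = 0;
    private T minValue = null;   // stays null until a non-NULL value is seen
    private T maxValue = null;

    /** Folds one column value into the running statistics. */
    void addValue(T value) {
        if (value == null) {
            numNullValues++;     // NULLs are counted but never compared
            return;
        }
        uniqueValues.add(value); // memory grows with the number of distinct values
        if (minValue == null || value.compareTo(minValue) < 0) minValue = value;
        if (maxValue == null || value.compareTo(maxValue) > 0) maxValue = value;
    }

    int getNumNullValues() { return numNullValues; }
    int getNumUniqueValues() { return uniqueValues.size(); }
    T getMinValue() { return minValue; }
    T getMaxValue() { return maxValue; }
}

class ColumnStatsSketchDemo {
    /** Runs a single pass over one column's values, as an analyze step might. */
    static ColumnStatsSketch<Integer> analyze(List<Integer> columnValues) {
        ColumnStatsSketch<Integer> collector = new ColumnStatsSketch<>();
        for (Integer v : columnValues) {
            collector.addValue(v);
        }
        return collector;
    }

    public static void main(String[] args) {
        ColumnStatsSketch<Integer> c = analyze(Arrays.asList(7, null, 3, 7, 12, null));
        System.out.println(c.getNumUniqueValues() + " unique, " + c.getNumNullValues()
                + " NULL, min=" + c.getMinValue() + ", max=" + c.getMaxValue());
    }
}
```

Because the set keeps one entry per distinct value, a column with mostly unique values costs memory roughly proportional to the table size, which is the limitation the design note points out.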
-
-
Field Summary
Fields
Modifier and Type
Field
Description
(package private) java.lang.Comparable
maxValue
The maximum value seen in the column's values, or null if the maximum is unknown or won't be computed.
(package private) java.lang.Comparable
minValue
The minimum value seen in the column's values, or null if the minimum is unknown or won't be computed.
private int
numNullValues
A count of the number of NULL values seen in the column-values.
private SQLDataType
sqlType
The SQL data-type for the column that stats are being collected for.
private java.util.HashSet<java.lang.Object>
uniqueValues
The set of all values seen in this column.
-
Constructor Summary
Constructors
Constructor
Description
ColumnStatsCollector(SQLDataType sqlType)
Initializes a new column-stats collector object for a column with the specified base SQL datatype.
-
Method Summary
Modifier and Type
Method
Description
void
addValue(java.lang.Object value)
Adds another column-value to this stats-collector object, updating the statistics for the column.
ColumnStats
getColumnStats()
This helper method constructs and returns a new column-statistics object containing the stats collected by this object.
java.lang.Object
getMaxValue()
Returns the maximum value seen for the column, or null if the column's type isn't supported for comparison estimates (or if there aren't any rows in the table being analyzed).
java.lang.Object
getMinValue()
Returns the minimum value seen for the column, or null if the column's type isn't supported for comparison estimates (or if there aren't any rows in the table being analyzed).
int
getNumNullValues()
Returns the number of NULL values seen for the column.
int
getNumUniqueValues()
Returns the number of unique (and non-NULL) values seen for the column.
-
-
-
Field Detail
-
sqlType
private SQLDataType sqlType
The SQL data-type for the column that stats are being collected for.
-
uniqueValues
private java.util.HashSet<java.lang.Object> uniqueValues
The set of all values seen in this column. This set could obviously occupy a large amount of memory for large tables.
-
numNullValues
private int numNullValues
A count of the number of NULL values seen in the column-values.
-
minValue
java.lang.Comparable minValue
The minimum value seen in the column's values, or null if the minimum is unknown or won't be computed.
-
maxValue
java.lang.Comparable maxValue
The maximum value seen in the column's values, or null if the maximum is unknown or won't be computed.
-
-
Constructor Detail
-
ColumnStatsCollector
public ColumnStatsCollector(SQLDataType sqlType)
Initializes a new column-stats collector object for a column with the specified base SQL datatype.
- Parameters:
sqlType - the base SQL datatype for the column.
-
-
Method Detail
-
addValue
public void addValue(java.lang.Object value)
Adds another column-value to this stats-collector object, updating the statistics for the column.
- Parameters:
value - the value from the column being analyzed.
- Design Note:
- (Donnie) We have to suppress "unchecked operation" warnings on this code, since Comparable is a generic (and thus allows us to specify the type of object being compared), but we want to use it without specifying any types.
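The design note can be illustrated with a small sketch of a raw Comparable comparison. The class RawComparableSketch and its smaller and larger helpers are hypothetical names chosen for illustration; the actual body of addValue is not shown on this page, so this only demonstrates the kind of call that triggers the warning.

```java
/** Hypothetical helper illustrating a raw Comparable comparison; not NanoDB code. */
class RawComparableSketch {

    /**
     * Returns whichever of current and candidate is smaller.
     * current may be null, meaning no value has been seen yet.
     */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Comparable smaller(Comparable current, Object candidate) {
        if (!(candidate instanceof Comparable)) {
            return current;                  // not comparable: leave the minimum alone
        }
        Comparable c = (Comparable) candidate;
        // Raw compareTo call: the compiler cannot verify that both operands
        // share a type, so it reports an "unchecked" warning here. A column
        // mixing types would fail with ClassCastException at run time.
        return (current == null || c.compareTo(current) < 0) ? c : current;
    }

    /** Mirror image of smaller(), used to track a running maximum. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Comparable larger(Comparable current, Object candidate) {
        if (!(candidate instanceof Comparable)) {
            return current;
        }
        Comparable c = (Comparable) candidate;
        return (current == null || c.compareTo(current) > 0) ? c : current;
    }
}
```

Using the raw type lets one code path handle integers, strings, and other comparable column values alike, at the cost of moving the type check from compile time to run time.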
-
getNumNullValues
public int getNumNullValues()
Returns the number of NULL values seen for the column.
- Returns:
- the number of NULL values seen for the column
-
getNumUniqueValues
public int getNumUniqueValues()
Returns the number of unique (and non-NULL) values seen for the column.
- Returns:
- the number of unique (and non-NULL) values seen for the column
-
getMinValue
public java.lang.Object getMinValue()
Returns the minimum value seen for the column, or null if the column's type isn't supported for comparison estimates (or if there aren't any rows in the table being analyzed).
- Returns:
- the minimum value in the table for the column, or null if no minimum is available
-
getMaxValue
public java.lang.Object getMaxValue()
Returns the maximum value seen for the column, or null if the column's type isn't supported for comparison estimates (or if there aren't any rows in the table being analyzed).
- Returns:
- the maximum value in the table for the column, or null if no maximum is available
-
getColumnStats
public ColumnStats getColumnStats()
This helper method constructs and returns a new column-statistics object containing the stats collected by this object.
- Returns:
- a new column-stats object containing the stats that have been collected by this object
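As a rough illustration of "constructs and returns a new object", the sketch below copies the running counts into a separate immutable result on each call. StatsSnapshot and SnapshotSketch are hypothetical names; the real ColumnStats constructor and fields are not shown on this page, and minimum and maximum tracking is left out to keep the example short.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical immutable result type standing in for ColumnStats. */
final class StatsSnapshot {
    final int numUniqueValues;
    final int numNullValues;

    StatsSnapshot(int numUniqueValues, int numNullValues) {
        this.numUniqueValues = numUniqueValues;
        this.numNullValues = numNullValues;
    }
}

/** Hypothetical collector that hands out snapshots of its current counts. */
class SnapshotSketch {
    private final Set<Object> uniqueValues = new HashSet<>();
    private int numNullValues = 0;

    void addValue(Object value) {
        if (value == null) {
            numNullValues++;
        } else {
            uniqueValues.add(value);
        }
    }

    /** Builds a fresh result object; later addValue calls do not change it. */
    StatsSnapshot getColumnStats() {
        return new StatsSnapshot(uniqueValues.size(), numNullValues);
    }
}
```

Returning a new object each time means a caller can store the result without it changing if the collector keeps receiving values.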
-
-