geochemistrypi.data_mining.plot package
Submodules
geochemistrypi.data_mining.plot.geochemistry_plot module
geochemistrypi.data_mining.plot.map_plot module
- map_projected_by_basemap(col: Series, name_column: str, longitude: DataFrame, latitude: DataFrame) → None
Project the data of one element onto a world map using Basemap.
- Parameters:
col (pd.Series) – One selected column from the data sheet.
longitude (pd.DataFrame) – Longitude data of data items.
latitude (pd.DataFrame) – Latitude data of data items.
- map_projected_by_cartopy(col: Series, name_column: str, longitude: DataFrame, latitude: DataFrame) → None
Project the data of one element onto a world map using cartopy.
- Parameters:
col (pd.Series) – One selected column from the data sheet.
longitude (pd.DataFrame) – Longitude data of data items.
latitude (pd.DataFrame) – Latitude data of data items.
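Both map functions take the same inputs: one element column plus longitude and latitude frames. As a rough illustration of that data flow, and not the package's implementation, the sketch below draws the samples on a plain longitude/latitude grid with matplotlib alone, so it runs without Basemap or cartopy. The helper name `scatter_on_lonlat_grid`, the column names, and the sample values are all hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd


def scatter_on_lonlat_grid(col, name_column, longitude, latitude):
    """Hypothetical stand-in: colour sample locations by an element's value."""
    fig, ax = plt.subplots(figsize=(8, 4))
    # Single-column frames are flattened to 1-D arrays for plotting.
    lon = longitude.to_numpy().ravel()
    lat = latitude.to_numpy().ravel()
    points = ax.scatter(lon, lat, c=col.to_numpy(), cmap="viridis", s=30)
    ax.set_xlim(-180, 180)
    ax.set_ylim(-90, 90)
    ax.set_xlabel("Longitude (degrees)")
    ax.set_ylabel("Latitude (degrees)")
    ax.set_title(f"{name_column} by sample location")
    fig.colorbar(points, ax=ax, label=name_column)
    return fig, ax


# Made-up sample data for illustration only.
sheet = pd.DataFrame(
    {
        "LONGITUDE": [116.4, -74.0, 151.2],
        "LATITUDE": [39.9, 40.7, -33.9],
        "SIO2": [52.1, 48.3, 60.7],
    }
)
fig, ax = scatter_on_lonlat_grid(
    sheet["SIO2"], "SIO2", sheet[["LONGITUDE"]], sheet[["LATITUDE"]]
)
```

A real map projection would add coastlines and a coordinate transform on top of this; the scatter call and the colour mapping stay conceptually the same.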
geochemistrypi.data_mining.plot.statistic_plot module
- basic_statistic(data: DataFrame) → None
Basic statistical information about the designated data set.
- Parameters:
data (pd.DataFrame) – The data set.
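The kind of summary described here can be approximated with pandas' own `DataFrame.describe()`. This is only a sketch of what "basic statistical information" might look like; whether the package reports exactly these statistics is an assumption, and the column names and values are made up.

```python
import pandas as pd

# Made-up data set for illustration only.
data = pd.DataFrame(
    {
        "SIO2": [52.1, 48.3, 60.7, 55.0],
        "MGO": [7.2, 9.8, 2.1, 5.5],
    }
)

# count, mean, std, min, quartiles and max for every numeric column.
summary = data.describe()
print(summary)
```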
- check_missing_value(data: DataFrame) → bool
Check whether the data set contains any null values.
- Parameters:
data (pd.DataFrame) – The data set.
- Returns:
flag – True if the data set contains at least one null value.
- Return type:
bool
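A check of this shape can be written in one line with pandas. The helper below is a hypothetical equivalent, named `has_missing_value` to make clear it is not the package's own function; the per-column count at the end is an extra that is often useful alongside the flag.

```python
import numpy as np
import pandas as pd


def has_missing_value(data: pd.DataFrame) -> bool:
    """Hypothetical helper: True if any cell in the frame is null."""
    return bool(data.isnull().any().any())


# Made-up data sets for illustration only.
complete = pd.DataFrame({"SIO2": [52.1, 48.3], "MGO": [7.2, 9.8]})
with_gap = pd.DataFrame({"SIO2": [52.1, np.nan], "MGO": [7.2, 9.8]})

flag_complete = has_missing_value(complete)
flag_with_gap = has_missing_value(with_gap)

# Number of null cells in each column.
null_counts = with_gap.isnull().sum()
```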
- correlation_plot(col: Index, df: DataFrame, name_column: str) → None
A heatmap describing the correlation between the required columns.
- Parameters:
col (pd.Index) – The columns to plot.
df (pd.DataFrame) – The data set.
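A correlation heatmap of selected columns can be sketched with `DataFrame.corr()` and matplotlib's `imshow`. The choice of Pearson correlation, the colour map, and the plotting calls are assumptions made for this example and may differ from what the package does; the data values are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data set for illustration only.
df = pd.DataFrame(
    {
        "SIO2": [52.1, 48.3, 60.7, 55.0],
        "MGO": [7.2, 9.8, 2.1, 5.5],
        "CAO": [9.1, 11.0, 4.3, 7.9],
    }
)
col = pd.Index(["SIO2", "MGO", "CAO"])

# Pairwise Pearson correlation (the pandas default) of the chosen columns.
corr = df[col].corr()

fig, ax = plt.subplots()
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(col)))
ax.set_xticklabels(list(col))
ax.set_yticks(range(len(col)))
ax.set_yticklabels(list(col))
fig.colorbar(image, ax=ax, label="Pearson correlation")
```

Fixing the colour range to [-1, 1] keeps heatmaps comparable across different column selections.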
- distribution_plot(col: Index, df: DataFrame, name_column: str) → None
A figure of histograms showing the distribution of each required column, one subplot per column.
- Parameters:
col (pd.Index) – The columns to plot.
df (pd.DataFrame) – The data set.
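One way to lay out a histogram per column is a row of matplotlib subplots. The layout, bin count, and labels below are illustrative assumptions rather than the package's settings, and the data values are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data set for illustration only.
df = pd.DataFrame(
    {
        "SIO2": [52.1, 48.3, 60.7, 55.0, 51.2, 49.9],
        "MGO": [7.2, 9.8, 2.1, 5.5, 6.9, 8.4],
    }
)
col = pd.Index(["SIO2", "MGO"])

# squeeze=False keeps `axes` two-dimensional even for a single column.
fig, axes = plt.subplots(1, len(col), figsize=(4 * len(col), 3), squeeze=False)
for ax, name in zip(axes.ravel(), col):
    ax.hist(df[name].dropna(), bins=5)
    ax.set_title(name)
    ax.set_xlabel("value")
    ax.set_ylabel("count")
fig.tight_layout()
```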
- is_null_value(data: DataFrame) → None
Check whether the data set contains any null values.
- Parameters:
data (pd.DataFrame) – The data set.
- log_distribution_plot(col: Index, df: DataFrame, name_column: str) → None
A figure of histograms showing the distribution of each required column after log transformation, one subplot per column.
- Parameters:
col (pd.Index) – The columns to plot.
df (pd.DataFrame) – The data set.
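The documentation does not say which logarithm is applied or how non-positive values are handled, so the sketch below makes both choices explicit as assumptions: a base-10 log, with zero or negative entries masked to NaN before the transform. The binned counts from `numpy.histogram` are the numbers a histogram subplot would display. Column names and values are made up.

```python
import numpy as np
import pandas as pd

# Made-up trace-element data; the 0.0 entry cannot be log-transformed.
df = pd.DataFrame({"NI": [1.0, 10.0, 100.0, 1000.0, 0.0]})
col = pd.Index(["NI"])

# Assumption: base-10 log on strictly positive values; other entries become NaN.
positive = df[col].where(df[col] > 0)
logged = np.log10(positive)

# Bin the transformed values, ignoring the masked entries.
counts, edges = np.histogram(logged["NI"].dropna(), bins=3)
```

Log transformation is common for concentration data because such values often span several orders of magnitude, which squeezes most samples into the first bin of an untransformed histogram.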
- probability_plot(col: Index, df_origin: DataFrame, df_impute: DataFrame, name_column: str) → None
A single figure containing the probability plots (original vs. imputed) of each required column.
- Parameters:
col (pd.Index) – The columns to plot.
df_origin (pd.DataFrame (n_samples, n_components)) – The original data set with missing values.
df_impute (pd.DataFrame (n_samples, n_components)) – The dataset after imputation.
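A normal probability plot pairs the sorted sample values with the quantiles a normal distribution would give at the same ranks. The sketch below computes those point pairs for an original and an imputed column so the two can be compared. The helper `normal_probability_points`, the Hazen plotting positions, and the mean imputation are assumptions chosen for this example, not the package's documented behaviour.

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def normal_probability_points(values: pd.Series):
    """Hypothetical helper: (theoretical quantiles, ordered sample values)."""
    ordered = np.sort(values.dropna().to_numpy())
    n = len(ordered)
    # Hazen plotting positions (i - 0.5) / n; other conventions exist.
    positions = (np.arange(1, n + 1) - 0.5) / n
    theoretical = np.array([NormalDist().inv_cdf(p) for p in positions])
    return theoretical, ordered


# Made-up data set with one missing value.
df_origin = pd.DataFrame({"SIO2": [52.1, np.nan, 60.7, 55.0, 48.3]})
# Mean imputation is used here only as a placeholder strategy.
df_impute = df_origin.fillna(df_origin.mean())

q_origin, v_origin = normal_probability_points(df_origin["SIO2"])
q_impute, v_impute = normal_probability_points(df_impute["SIO2"])
```

Scattering `q_origin` against `v_origin` and `q_impute` against `v_impute` on the same axes shows whether imputation has changed the shape of the distribution; points that fall near a straight line indicate approximately normal data.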