Filters
-
padua.filters.filter_exclude(df, s)
Filter dataframe to exclude matching columns, based on a search for “s”.
Parameters: s – string to search for; matching columns are excluded
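As an illustration of the column exclusion described above, here is a minimal sketch in plain pandas. The helper `exclude_columns` is a hypothetical stand-in, not padua's implementation, and it assumes the search is a plain, case-sensitive substring match on column names; the library may match differently.

```python
import pandas as pd


def exclude_columns(df, s):
    """Hypothetical stand-in: keep only columns whose name does not contain s."""
    # Assumption: literal substring match on the column label.
    keep = [c for c in df.columns if s not in str(c)]
    return df.loc[:, keep]


df = pd.DataFrame({
    "Proteins": ["P1", "P2"],
    "Intensity A": [1.0, 2.0],
    "Ratio H/L A": [0.5, 1.5],
})

filtered = exclude_columns(df, "Ratio")
print(list(filtered.columns))
```

With padua installed, the documented call would be `padua.filters.filter_exclude(df, "Ratio")`.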
-
padua.filters.filter_intensity(df, label='')
Filter to include only the Intensity values with optional specified label, excluding other Intensity measurements, but retaining all other columns.
-
padua.filters.filter_intensity_lfq(df, label='')
Filter to include only the LFQ intensity values with optional specified label, excluding other intensity measurements, but retaining all other columns.
-
padua.filters.filter_localization_probability(df, threshold=0.75)
Remove rows with a localization probability below the threshold (default 0.75).
Return a DataFrame where the rows with a value < threshold (default 0.75) in column ‘Localization prob’ are removed. Filters data to remove poorly localized peptides (non Class-I by default).
Parameters:
- df – Pandas DataFrame
- threshold – cut-off below which rows are discarded (default 0.75)
Returns: Pandas DataFrame
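The threshold rule can be sketched in plain pandas as follows. The helper `keep_well_localized` is hypothetical and only mirrors the behaviour described above (rows with a value below the threshold are dropped, so a value exactly equal to the threshold is kept); the sample data are made up for illustration.

```python
import pandas as pd


def keep_well_localized(df, threshold=0.75):
    """Hypothetical stand-in: drop rows whose 'Localization prob' is < threshold."""
    return df[df["Localization prob"] >= threshold]


df = pd.DataFrame({
    "Sequence": ["AAS(ph)K", "GT(ph)LR", "MS(ph)YK"],
    "Localization prob": [0.99, 0.75, 0.40],
})

class_one = keep_well_localized(df)        # default cut-off of 0.75
strict = keep_well_localized(df, 0.90)     # stricter cut-off
print(class_one["Sequence"].tolist())
print(strict["Sequence"].tolist())
```

With padua installed, the documented call would be `padua.filters.filter_localization_probability(df, threshold=0.75)`.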
-
padua.filters.filter_select_columns(df, columns)
Filter dataframe to include specified columns, retaining any Intensity columns.
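A rough sketch of this selection in plain pandas is shown below. The helper `select_columns` is hypothetical, and it assumes that "Intensity columns" means any column whose name contains the text `Intensity`; padua may identify them differently.

```python
import pandas as pd


def select_columns(df, columns):
    """Hypothetical stand-in: keep the requested columns plus Intensity columns."""
    # Assumption: an Intensity column is one with "Intensity" in its name.
    keep = [c for c in df.columns if c in columns or "Intensity" in str(c)]
    return df.loc[:, keep]


df = pd.DataFrame({
    "Proteins": ["P1", "P2"],
    "Gene names": ["ACTB", "TUBB"],
    "Score": [120.5, 88.1],
    "Intensity A": [1.0, 2.0],
    "Intensity B": [3.0, 4.0],
})

selected = select_columns(df, ["Proteins"])
print(list(selected.columns))
```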
-
padua.filters.minimum_valid_values_in_any_group(df, levels=None, n=1, invalid=np.nan)
Filter DataFrame by at least n valid values in at least one group.
Taking a Pandas DataFrame with a MultiIndex column index, removes rows that do not have at least n valid values in at least one group. Groups are defined by the levels parameter indexing into the column index. For example, a MultiIndex with top and second level Group (A,B,C) and Replicate (1,2,3) using levels=[0,1] would filter on n valid values per replicate. Alternatively, levels=[0] would filter on n valid values at the Group level only, e.g. A, B or C.
By default, values are treated as invalid if they are np.nan. However, alternatives can be supplied via invalid.
Parameters:
- df – Pandas DataFrame
- levels – list of int specifying levels of column MultiIndex to group by
- n – int minimum number of valid values threshold
- invalid – matching invalid value
Returns: filtered Pandas DataFrame
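The grouping logic can be sketched in plain pandas as below. The helper `min_valid_in_any_group` is a hypothetical stand-in that handles only the default case where invalid values are NaN; it counts non-NaN values per group of columns and keeps rows where at least one group reaches n. It is not padua's implementation.

```python
import numpy as np
import pandas as pd


def min_valid_in_any_group(df, levels, n=1):
    """Hypothetical stand-in: keep rows with >= n non-NaN values in any group."""
    valid = df.notna()
    # Count valid values per column group; transposing lets us group columns
    # by MultiIndex level with an ordinary row-wise groupby.
    counts = valid.T.groupby(level=levels).sum().T
    return df[counts.max(axis=1) >= n]


columns = pd.MultiIndex.from_product(
    [["A", "B"], [1, 2, 3]], names=["Group", "Replicate"]
)
data = [
    [1.0, 2.0, 3.0, np.nan, np.nan, np.nan],     # group A: 3 valid values
    [1.0, np.nan, np.nan, 2.0, np.nan, np.nan],  # one valid value per group
    [np.nan] * 6,                                # no valid values at all
]
df = pd.DataFrame(data, columns=columns, index=["p1", "p2", "p3"])

at_least_two = min_valid_in_any_group(df, levels=[0], n=2)
at_least_one = min_valid_in_any_group(df, levels=[0], n=1)
print(at_least_two.index.tolist())
print(at_least_one.index.tolist())
```

With padua installed, the documented call would be `padua.filters.minimum_valid_values_in_any_group(df, levels=[0], n=2)`.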
-
padua.filters.remove_columns_containing(df, column, match)
Return a DataFrame with the rows whose column values contain match removed.
The selected column series of values from the supplied Pandas DataFrame is compared to match, and those rows that contain it are removed from the DataFrame.
Parameters:
- df – Pandas DataFrame
- column – column indexer
- match – str match target
Returns: filtered Pandas DataFrame
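A minimal sketch of the "contains" row filter in plain pandas follows. The helper `drop_rows_containing` is hypothetical and assumes match is treated as a literal, case-sensitive substring; whether padua interprets it as a regular expression is not stated here.

```python
import pandas as pd


def drop_rows_containing(df, column, match):
    """Hypothetical stand-in: drop rows whose value in column contains match."""
    # Assumption: literal substring match, missing values never match.
    mask = df[column].fillna("").astype(str).str.contains(match, regex=False)
    return df[~mask]


df = pd.DataFrame({
    "Proteins": ["P12345", "CON__P00761", "Q9Y6K9;CON__P02768"],
    "Score": [101.2, 55.0, 73.4],
})

cleaned = drop_rows_containing(df, "Proteins", "CON__")
print(cleaned["Proteins"].tolist())
```

With padua installed, the documented call would be `padua.filters.remove_columns_containing(df, "Proteins", "CON__")`.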
-
padua.filters.remove_columns_matching(df, column, match)
Return a DataFrame with the rows whose column values match match removed.
The selected column series of values from the supplied Pandas DataFrame is compared to match, and those rows that match are removed from the DataFrame.
Parameters:
- df – Pandas DataFrame
- column – column indexer
- match – str match target
Returns: filtered Pandas DataFrame
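The whole-value comparison described above can be sketched in plain pandas as follows. The helper `drop_rows_matching` is hypothetical and assumes "match" means exact equality with the cell value, so a cell that merely contains the target is kept.

```python
import pandas as pd


def drop_rows_matching(df, column, match):
    """Hypothetical stand-in: drop rows whose value in column equals match."""
    # Assumption: exact, case-sensitive equality with the whole cell value.
    return df[df[column] != match]


df = pd.DataFrame({
    "Gene names": ["ACTB", "ACTB;ACTG1", "TUBB"],
    "Score": [101.2, 55.0, 73.4],
})

remaining = drop_rows_matching(df, "Gene names", "ACTB")
print(remaining["Gene names"].tolist())
```

With padua installed, the documented call would be `padua.filters.remove_columns_matching(df, "Gene names", "ACTB")`.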
-
padua.filters.remove_contaminants(df)
Remove rows with a + in the ‘Contaminants’ column.
Return a DataFrame where rows with a “+” in the column ‘Contaminants’ are removed. Filters data to remove peptides matched as contaminants.
Parameters: df – Pandas DataFrame
Returns: filtered Pandas DataFrame
-
padua.filters.remove_only_identified_by_site(df)
Remove rows with a + in the ‘Only identified by site’ column.
Return a DataFrame where rows with a “+” in the column ‘Only identified by site’ are removed. Filters data to remove entries that were only identified by site.
Parameters: df – Pandas DataFrame
Returns: filtered Pandas DataFrame
-
padua.filters.remove_potential_contaminants(df)
Remove rows with a + in the ‘Potential contaminant’ column.
Return a DataFrame where rows with a “+” in the column ‘Potential contaminant’ are removed. Filters data to remove peptides matched as potential contaminants.
Parameters: df – Pandas DataFrame
Returns: filtered Pandas DataFrame
-
padua.filters.remove_reverse(df)
Remove rows with a + in the ‘Reverse’ column.
Return a DataFrame where rows with a “+” in the column ‘Reverse’ are removed. Filters data to remove peptides matched as reverse.
Parameters: df – Pandas DataFrame
Returns: filtered Pandas DataFrame
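The “+” flag filters all follow the same pattern, which can be sketched in plain pandas as below. The helper `drop_flagged` is a hypothetical stand-in, shown here for the ‘Reverse’ column; it assumes a row is flagged only when the cell is exactly `+`, and that empty cells are read as missing values.

```python
import pandas as pd


def drop_flagged(df, column):
    """Hypothetical stand-in: drop rows carrying a '+' flag in column."""
    # Missing values are treated as "not flagged".
    flagged = df[column].fillna("").eq("+")
    return df[~flagged]


df = pd.DataFrame({
    "Proteins": ["P12345", "REV__Q9XYZ1", "P67890"],
    "Reverse": [None, "+", None],
})

forward_only = drop_flagged(df, "Reverse")
print(forward_only["Proteins"].tolist())
```

With padua installed, the documented call would be `padua.filters.remove_reverse(df)`.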
-
padua.filters.search(df, match, columns=['Proteins', 'Protein names', 'Gene names'])
Search for a given string in a set of columns in a processed DataFrame.
Returns a filtered DataFrame where match is contained in one of the columns.
Parameters:
- df – Pandas DataFrame
- match – str to search for in columns
- columns – list of str to search for match
Returns: filtered Pandas DataFrame
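A multi-column search of this kind can be sketched in plain pandas as follows. The helper `search_columns` is hypothetical and assumes a literal, case-sensitive substring match; padua's own matching rules may differ.

```python
import pandas as pd


def search_columns(df, match,
                   columns=("Proteins", "Protein names", "Gene names")):
    """Hypothetical stand-in: keep rows where match occurs in any listed column."""
    mask = pd.Series(False, index=df.index)
    for col in columns:
        # Assumption: literal substring match; missing values never match.
        mask |= df[col].fillna("").astype(str).str.contains(match, regex=False)
    return df[mask]


df = pd.DataFrame({
    "Proteins": ["P17252", "P60709", "P07437"],
    "Protein names": ["Protein kinase C alpha type", "Actin", "Tubulin beta chain"],
    "Gene names": ["PRKCA", "ACTB", "TUBB"],
})

by_name = search_columns(df, "kinase")
by_gene = search_columns(df, "TUBB")
print(by_name["Gene names"].tolist())
print(by_gene["Proteins"].tolist())
```

With padua installed, the documented call would be `padua.filters.search(df, "kinase")`.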