Filters
-
padua.filters.filter_localization_probability(df, threshold=0.75)
Remove rows with a localization probability below 0.75.
Return a DataFrame where the rows with a value < threshold (default 0.75) in column ‘Localization prob’ are removed. Filters data to remove poorly localized peptides (non Class-I by default).
Parameters:
- df – Pandas DataFrame
- threshold – Cut-off below which rows are discarded (default 0.75)
Returns: Pandas DataFrame
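The behaviour described above can be sketched with plain pandas. This is an illustration only, not padua's implementation: the helper name `keep_well_localized` and the toy data are assumptions, and how padua treats missing values in ‘Localization prob’ is not stated here.

```python
import pandas as pd


def keep_well_localized(df, threshold=0.75):
    """Hypothetical helper: keep rows whose 'Localization prob' is >= threshold."""
    return df[df["Localization prob"] >= threshold]


# Toy data (made up for illustration)
sites = pd.DataFrame({
    "Localization prob": [0.99, 0.75, 0.50, 0.10],
    "Intensity": [1.0e6, 2.0e6, 3.0e6, 4.0e6],
})

class_one = keep_well_localized(sites)
print(class_one)
```

A value exactly equal to the threshold is kept, since only values strictly below it are described as removed.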
-
padua.filters.minimum_valid_values_in_any_group(df, levels=None, n=1, invalid=np.nan)
Filter DataFrame by at least n valid values in at least one group.
Taking a Pandas DataFrame with a MultiIndex column index, filters out rows that do not have at least n valid values in at least one group. Groups are defined by the levels parameter indexing into the column index. For example, a MultiIndex with top and second level Group (A, B, C) and Replicate (1, 2, 3) using levels=[0,1] would filter on n valid values per replicate. Alternatively, levels=[0] would filter on n valid values at the Group level only, e.g. A, B or C.
By default, invalid values are those equal to np.nan. However, alternatives can be supplied via invalid.
Parameters:
- df – Pandas DataFrame
- levels – list of int specifying levels of column MultiIndex to group by
- n – int minimum number of valid values threshold
- invalid – matching invalid value
Returns: filtered Pandas DataFrame
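As a rough illustration of the grouping logic, the following pandas-only sketch counts non-NaN values per column group and keeps a row if any group reaches n. It is not padua's code: the helper name `keep_rows_valid_in_any_group` is hypothetical, `levels` is required here, and only NaN is treated as invalid.

```python
import numpy as np
import pandas as pd


def keep_rows_valid_in_any_group(df, levels, n=1):
    """Hypothetical helper: keep rows with >= n non-NaN values in at least one group."""
    valid = df.notna().astype(int)
    # Transpose so the column MultiIndex becomes the row index, then
    # count valid values per group and transpose back.
    counts = valid.T.groupby(level=levels).sum().T
    return df[(counts >= n).any(axis=1)]


nan = np.nan
columns = pd.MultiIndex.from_product(
    [["A", "B"], [1, 2, 3]], names=["Group", "Replicate"]
)
# Toy intensities (made up for illustration)
df = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, nan, nan, nan],  # 3 valid in A: kept
        [1.0, nan, nan, nan, 2.0, nan],  # 1 valid in A, 1 in B: removed
        [nan, nan, nan, 4.0, 5.0, nan],  # 2 valid in B: kept
        [nan, nan, nan, nan, nan, nan],  # no valid values: removed
    ],
    columns=columns,
)

filtered = keep_rows_valid_in_any_group(df, levels=[0], n=2)
print(filtered.index.tolist())  # [0, 2]
```

With `levels=[0, 1]` every (Group, Replicate) column forms its own group, so `n=1` would keep any row with at least one valid value.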
-
padua.filters.remove_columns_containing(df, column, match)
Return a DataFrame with the rows whose column values contain match removed.
The selected column series of values from the supplied Pandas DataFrame is compared to match, and those rows that contain it are removed from the DataFrame.
Parameters:
- df – Pandas DataFrame
- column – Column indexer
- match – str match target
Returns: filtered Pandas DataFrame
-
padua.filters.remove_columns_matching(df, column, match)
Return a DataFrame with the rows whose column values equal match removed.
The selected column series of values from the supplied Pandas DataFrame is compared to match, and those rows that match are removed from the DataFrame.
Parameters:
- df – Pandas DataFrame
- column – Column indexer
- match – str match target
Returns: filtered Pandas DataFrame
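The difference between "containing" and "matching" is easiest to see side by side. The sketch below uses plain pandas with hypothetical helper names (`drop_rows_containing`, `drop_rows_matching`) and toy data. It assumes a literal substring test for the first and exact equality for the second; whether padua interprets match as a regular expression is not stated here.

```python
import pandas as pd

# Toy data (made up for illustration)
proteins = pd.DataFrame({
    "Protein names": ["Keratin, type I", "Keratin", "Actin", "Tubulin"],
    "Intensity": [1.0, 2.0, 3.0, 4.0],
})


def drop_rows_containing(df, column, match):
    """Hypothetical helper: drop rows where `match` occurs anywhere in the value."""
    return df[~df[column].str.contains(match, regex=False)]


def drop_rows_matching(df, column, match):
    """Hypothetical helper: drop rows where the value equals `match` exactly."""
    return df[df[column] != match]


without_substring = drop_rows_containing(proteins, "Protein names", "Keratin")
without_exact = drop_rows_matching(proteins, "Protein names", "Keratin")

print(without_substring["Protein names"].tolist())  # ['Actin', 'Tubulin']
print(without_exact["Protein names"].tolist())  # ['Keratin, type I', 'Actin', 'Tubulin']
```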
-
padua.filters.remove_only_identified_by_site(df)
Remove rows with a + in the ‘Only identified by site’ column.
Return a DataFrame where rows where there is a “+” in the column ‘Only identified by site’ are removed. Filters data to remove entries identified only by a modification site.
Parameters: df – Pandas DataFrame
Returns: filtered Pandas DataFrame
-
padua.filters.remove_potential_contaminants(df)
Remove rows with a + in the ‘Contaminants’ column.
Return a DataFrame where rows where there is a “+” in the column ‘Contaminants’ are removed. Filters data to remove peptides matched as potential contaminants.
Parameters: df – Pandas DataFrame
Returns: filtered Pandas DataFrame
-
padua.filters.remove_reverse(df)
Remove rows with a + in the ‘Reverse’ column.
Return a DataFrame where rows where there is a “+” in the column ‘Reverse’ are removed. Filters data to remove peptides matched as reverse.
Parameters: df – Pandas DataFrame
Returns: filtered Pandas DataFrame
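The three “+” marker filters above share one pattern: drop every row flagged in a given column. A minimal pandas-only sketch of that pattern follows. The helper `drop_flagged_rows` and the toy table are assumptions for illustration, not padua's code; unflagged cells are represented here as empty strings.

```python
import pandas as pd


def drop_flagged_rows(df, column):
    """Hypothetical helper: drop rows where `column` holds the '+' marker."""
    return df[df[column] != "+"]


# Toy protein table (made up for illustration)
groups = pd.DataFrame({
    "Protein IDs": ["P1", "REV__P2", "CON__P3", "P4", "P5"],
    "Reverse": ["", "+", "", "", ""],
    "Contaminants": ["", "", "+", "", ""],
    "Only identified by site": ["", "", "", "+", ""],
})

# Apply the three marker filters one after another
clean = groups
for column in ["Reverse", "Contaminants", "Only identified by site"]:
    clean = drop_flagged_rows(clean, column)

print(clean["Protein IDs"].tolist())  # ['P1', 'P5']
```

Because each step only removes rows, the order in which the three filters are applied does not change the result.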
-
padua.filters.search(df, match, columns=['Proteins', 'Protein names', 'Gene names'])
Search for a given string in a set of columns in a processed DataFrame.
Returns a filtered DataFrame where match is contained in one of the columns.
Parameters:
- df – Pandas DataFrame
- match – str to search for in columns
- columns – list of str to search for match
Returns: filtered Pandas DataFrame
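A search across several columns can be sketched as a boolean mask combined with OR over the columns. The example below is a pandas-only illustration with a hypothetical helper `search_columns` and toy data. It assumes a case-sensitive literal substring test and treats missing values as non-matching; padua's actual matching rules are not stated here.

```python
import pandas as pd


def search_columns(df, match, columns=("Proteins", "Protein names", "Gene names")):
    """Hypothetical helper: keep rows where `match` occurs in any of `columns`."""
    mask = pd.Series(False, index=df.index)
    for column in columns:
        # na=False makes missing values count as "no match"
        mask |= df[column].str.contains(match, regex=False, na=False)
    return df[mask]


# Toy data (made up for illustration)
table = pd.DataFrame({
    "Proteins": ["P1", "P2", "P3"],
    "Protein names": ["Actin beta", "Tubulin alpha", None],
    "Gene names": ["ACTB", "TUBA1A", "ACTR2"],
})

hits = search_columns(table, "ACT")
print(hits["Proteins"].tolist())  # ['P1', 'P3']
```

Here "ACT" matches only through the ‘Gene names’ column, because "Actin beta" does not contain the upper-case string.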