Introduction
PteRedactyl is a redaction package for personally identifiable information (PII) in text, that combines NER models with regex matching to identify and mask sensitive information. Custom transformers-based NER models can be swapped in,
This guide will cover usage of PteRedactyl's main functions: create_analyser
, analyse
, anonymise
and anonymise_df
:
create_analyser()
: Creates a presidio AnalyserEngine that can be reused across high-level functionsanalyse()
: Analyses a string for PII, returning a list of recognised NER or regex-based results with scoresanonymise()
: Analyses and anonymises text, replacing PII with placeholders or hide-in-plain-sight (HIPS) replacementsanonymise_df()
: As above, but by looping over a given column (or columns) in a DataFrame