
edsnlp.pipes.core.normalizer.accents.accents

AccentsConverter

Bases: BaseComponent

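Normalises accents using a same-length strategy: each accented character is replaced by its transcription, so that the normalised text keeps the same length as the original.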

Parameters

PARAMETER DESCRIPTION
nlp

The pipeline object.

TYPE: Optional[PipelineProtocol] DEFAULT: None

name

The component name.

TYPE: Optional[str] DEFAULT: None

accents

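List of accented characters and their transcription. Each tuple pairs a string of accented characters with the replacement applied to every character in that string.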

TYPE: List[Tuple[str, str]] DEFAULT: [('ç', 'c'), ('àáâä', 'a'), ('èéêë', 'e'), ('ìí...
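
To illustrate the same-length strategy, here is a minimal sketch in plain Python. It is not the component's actual implementation: the helper names `build_table` and `strip_accents` are hypothetical, and `ACCENTS` contains only the three pairs visible in the truncated default above, not the full default list. It assumes the input text uses precomposed characters (for example, a single code point for "é").

```python
from typing import Dict, List, Tuple

# Assumption: only the pairs visible in the truncated default shown above.
# The real default list is longer and is not reproduced here.
ACCENTS: List[Tuple[str, str]] = [
    ("ç", "c"),
    ("àáâä", "a"),
    ("èéêë", "e"),
]


def build_table(accents: List[Tuple[str, str]]) -> Dict[int, str]:
    """Expand (sources, target) pairs into a str.translate table (hypothetical helper)."""
    table: Dict[int, str] = {}
    for sources, target in accents:
        # Every character of `sources` maps to the same replacement.
        for char in sources:
            table[ord(char)] = target
    return table


def strip_accents(text: str, accents: List[Tuple[str, str]] = ACCENTS) -> str:
    """Replace each listed accented character by its transcription (hypothetical helper)."""
    return text.translate(build_table(accents))


original = "déjà reçu"
normalised = strip_accents(original)
# With single-character transcriptions, the output keeps the input length,
# so character offsets computed on either string remain aligned.
assert len(normalised) == len(original)
```

Because every replacement in this sketch is a single character, offsets into the normalised string still point at the same positions in the original text, which is the point of a same-length strategy.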

__call__

Removes accents from the spaCy NORM attribute.

Parameters

PARAMETER DESCRIPTION
doc

The spaCy Doc object.

TYPE: Doc

RETURNS DESCRIPTION
Doc

The document, with accents removed in Token.norm_.