PorterStemmer
- class cuml.preprocessing.text.stem.PorterStemmer(mode='NLTK_EXTENSIONS')
A word stemmer based on the Porter stemming algorithm.
Porter, M. “An algorithm for suffix stripping.” Program 14.3 (1980): 130-137.
See http://www.tartarus.org/~martin/PorterStemmer/ for the homepage of the algorithm.
Martin Porter has endorsed several modifications to the Porter algorithm since writing his original paper, and those extensions are included in the implementations on his website. Others, including NLTK contributors, have proposed further improvements to the algorithm. Currently only the following mode is supported:
PorterStemmer.NLTK_EXTENSIONS
An implementation that includes further improvements devised by NLTK contributors or taken from other modified implementations found on the web.
- Parameters:
- mode: Stemming mode. Only 'NLTK_EXTENSIONS' is currently supported (default: 'NLTK_EXTENSIONS').
Methods
stem(word_str_ser): Stem words using the Porter stemmer.
Examples
>>> import cudf
>>> from cuml.preprocessing.text.stem import PorterStemmer
>>> stemmer = PorterStemmer()
>>> word_str_ser = cudf.Series(['revival','singing','adjustable'])
>>> print(stemmer.stem(word_str_ser))
0     reviv
1      sing
2    adjust
dtype: object
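For intuition about what "suffix stripping" means, the following is a minimal CPU-only sketch of a single step (step 1a, plural suffixes) from Porter's original 1980 paper. It is not the cuML implementation: the function name `porter_step_1a` is hypothetical, the full algorithm applies several further steps, and the NLTK_EXTENSIONS mode may treat some of these cases differently.

```python
def porter_step_1a(word: str) -> str:
    """Apply step 1a of Porter's 1980 algorithm to a lowercase word.

    Hypothetical helper for illustration only; not part of cuML or NLTK.
    """
    # Rules are tried in order and only the first matching suffix is rewritten.
    rules = (
        ("sses", "ss"),  # caresses -> caress
        ("ies", "i"),    # ponies   -> poni
        ("ss", "ss"),    # caress   -> caress (left unchanged)
        ("s", ""),       # cats     -> cat
    )
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word


if __name__ == "__main__":
    for w in ("caresses", "ponies", "caress", "cats"):
        print(w, "->", porter_step_1a(w))
```

The ordered rule table mirrors how the paper presents each step: the longest applicable suffix is listed first so that, for example, "caresses" is handled by the "sses" rule rather than the bare "s" rule.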