API Reference

The semanticizest API is not stable and may change without notice.

Semanticizest

class semanticizest.Semanticizer(fname)

Entity linker.

Parameters:

fname : string

Filename of the stored model from which to load the Wikipedia statistics.

Methods

all_candidates(s)

Retrieve all candidate entities.

Parameters:

s : {string, iterable over string}

Tokens. If a string, it will be tokenized using a naive heuristic.

Returns:

candidates : iterable over (int, int, string, float)

Each candidate is a 4-tuple: the start and end indices (both refer to positions in the tokenized input, and both count from 1), the target entity (the title of the Wikipedia article), and the probability (commonness).
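As a rough illustration of how the returned tuples might be consumed, the sketch below keeps the most probable target for each span. The helper name best_per_span and the sample tuples are hypothetical and are not output of a real model; only the 4-tuple shape (start, end, target, probability) is taken from the description above, and the index values are placeholders.

```python
def best_per_span(candidates):
    """Keep the highest-commonness target for each (start, end) span.

    `candidates` is any iterable of (start, end, target, probability)
    tuples, the shape documented for all_candidates.
    """
    best = {}
    for start, end, target, prob in candidates:
        span = (start, end)
        # Replace the stored entry only when this one is more probable.
        if span not in best or prob > best[span][1]:
            best[span] = (target, prob)
    return best


# Made-up tuples for illustration only; not produced by a real model.
sample = [
    (1, 2, "New York City", 0.8),
    (1, 2, "New York (state)", 0.2),
    (3, 3, "York", 0.9),
]
result = best_per_span(sample)
```

Because the function only needs an iterable, the generator returned by all_candidates could presumably be passed in directly without building a list first.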

utils

semanticizest._util.ngrams(lst, N=None)

Generate bare n-grams from a list of strings.

See also ngrams_with_pos for a description of the arguments.

semanticizest._util.ngrams_with_pos(lst, N=None)

Generate n-grams with indices from a list of strings.

Parameters:

lst : list-like of strings

N : int, optional

Maximum n-gram length, defaults to the length of lst.

Returns:

tuple (start, end, n-gram)

start and end are indices into the original list lst; n-gram is the string formed by joining the covered tokens with spaces. The n-grams are yielded in leftmost-longest order.

Raises:

TypeError :

If N is not an integer.

ValueError :

If N is not at least 1.
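The behaviour described above can be approximated with a minimal sketch. This is not the library's source: the names ngrams_with_pos_sketch and ngrams_sketch are hypothetical, and the sketch assumes that start is 0-based and end is exclusive (so lst[start:end] holds the n-gram's tokens) and that "leftmost longest" means iterating start positions left to right, yielding the longest n-gram first at each position.

```python
from numbers import Integral


def ngrams_with_pos_sketch(lst, N=None):
    """Yield (start, end, n-gram) tuples, leftmost start first, longest first.

    Assumption: start is 0-based and end is exclusive, so that
    lst[start:end] holds the tokens of the n-gram.
    """
    if N is None:
        N = len(lst)
    elif not isinstance(N, Integral):
        raise TypeError("N should be an integer, got %r" % type(N))
    elif N < 1:
        raise ValueError("N should be at least 1, got %d" % N)

    for start in range(len(lst)):
        # Longest n-gram that still fits, then progressively shorter ones.
        longest = min(N, len(lst) - start)
        for n in range(longest, 0, -1):
            end = start + n
            yield start, end, " ".join(lst[start:end])


def ngrams_sketch(lst, N=None):
    """Bare n-grams: the same sequence without the indices."""
    return (ngram for _, _, ngram in ngrams_with_pos_sketch(lst, N))


grams = list(ngrams_with_pos_sketch(["a", "b", "c"], N=2))
# grams == [(0, 2, "a b"), (0, 1, "a"), (1, 3, "b c"), (1, 2, "b"), (2, 3, "c")]
```

Since the sketch is a generator, the TypeError and ValueError are raised on first iteration rather than at call time; whether the real function behaves the same way is not stated in this reference.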

semanticizest._util.tosequence(x)

Cast x to sequence. Returns x if at all possible.
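One plausible reading of "returns x if at all possible" is that an object which is already a sequence is passed through untouched, while any other iterable is copied into a list. The sketch below illustrates that reading; tosequence_sketch is a hypothetical name and the use of collections.abc.Sequence as the test is an assumption, not the documented implementation.

```python
from collections.abc import Sequence


def tosequence_sketch(x):
    """Return x unchanged if it is already a sequence, else copy it into a list."""
    return x if isinstance(x, Sequence) else list(x)


tokens = ["a", "b"]
same = tosequence_sketch(tokens)              # already a sequence: returned as-is
made = tosequence_sketch(t for t in tokens)   # generator: materialized into a list
```

Under this reading a string is also returned unchanged, since str is itself a sequence.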

semanticizest._util.url_from_title(title, wiki)

Turn an article title into a Wikipedia URL.

Parameters:

wiki : string

Denotes the specific Wikipedia (language), e.g. “en”.
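To show the kind of mapping involved, here is a small sketch that builds a URL from a title and a language code. It is an illustration only: url_from_title_sketch is a hypothetical name, and the details (spaces replaced by underscores, percent-encoding of other characters, the https scheme and the <wiki>.wikipedia.org host) are assumptions based on how Wikipedia URLs usually look, not on the library's code.

```python
from urllib.parse import quote


def url_from_title_sketch(title, wiki):
    """Build an article URL for the given Wikipedia language edition.

    Assumptions: spaces map to underscores, remaining special
    characters are percent-encoded as UTF-8, and the host is
    <wiki>.wikipedia.org.
    """
    path = quote(title.replace(" ", "_"), safe="_(),:/")
    return "https://{}.wikipedia.org/wiki/{}".format(wiki, path)


url = url_from_title_sketch("New York City", "en")
# url == "https://en.wikipedia.org/wiki/New_York_City"
```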
