EduNLP.I2V

class EduNLP.I2V.i2v.D2V(tokenizer, t2v, *args, tokenizer_kwargs: Optional[dict] = None, pretrained_t2v=False, **kwargs)[source]

This model converts an item directly into a vector.

Bases: I2V

Parameters
  • tokenizer (str) – the tokenizer name

  • t2v (str) – the name of token2vector model

  • args – the parameters passed to t2v

  • tokenizer_kwargs (dict) – the parameters passed to tokenizer

  • pretrained_t2v (bool) –

    True: use pretrained t2v model

    False: use your own t2v model

  • kwargs – the parameters passed to t2v

Examples

>>> from EduNLP.I2V.i2v import D2V
>>> # A Chinese math item: a geometric probability question with LaTeX formulas
>>> item = {r"如图来自古希腊数学家希波克拉底所研究的几何图形.此图由三个半圆构成,三个半圆的直径分别为直角三角形$ABC$的斜边$BC$, "
...         r"直角边$AB$, $AC$.$\bigtriangleup ABC$的三边所围成的区域记为$I$,黑色部分记为$II$, 其余部分记为$III$.在整个图形中随机取一点,"
...         r"此点取自$I,II,III$的概率分别记为$p_1,p_2,p_3$,则$\SIFChoice$$\FigureID{1}$"}
>>> model_path = "examples/test_model/test_gensim_luna_stem_tf_d2v_256.bin"
>>> i2v = D2V("text", "d2v", filepath=model_path, pretrained_t2v=False)
>>> i2v(item)
([array([...], dtype=float32)], None)

Returns

i2v model

Return type

I2V

infer_vector(items, tokenize=True, indexing=False, padding=False, key=<function D2V.<lambda>>, *args, **kwargs) → tuple[source]

Converts items to vectors. The model must be loaded before this function is called.

Parameters
  • items (str) – the text of question

  • tokenize (bool) – True: tokenize the item

  • indexing (bool) –

  • padding (bool) –

  • key (lambda function) – a function passed to the tokenizer that selects the text to be processed

  • args – the parameters passed to t2v

  • kwargs – the parameters passed to t2v

Returns

vector

Return type

list
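
The roles of tokenize and key can be illustrated with a small standalone sketch. It does not call EduNLP: toy_tokenize, toy_t2v, and toy_infer_vector are hypothetical stand-ins, and the handling of tokenize=False (items treated as already tokenized) is an assumption rather than documented behavior.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def toy_tokenize(text: str) -> List[str]:
    """Hypothetical tokenizer: split the text on whitespace."""
    return text.split()


def toy_t2v(tokens: Sequence[str]) -> List[float]:
    """Hypothetical token-to-vector step: [token count, total characters]."""
    return [float(len(tokens)), float(sum(len(t) for t in tokens))]


def toy_infer_vector(
    items: Sequence[Any],
    tokenize: bool = True,
    key: Callable[[Any], str] = lambda x: x,
) -> Tuple[List[List[float]], Optional[list]]:
    """Toy stand-in for infer_vector; not EduNLP's implementation."""
    vectors = []
    for item in items:
        if tokenize:
            # key picks the text to process out of each item
            tokens = toy_tokenize(key(item))
        else:
            # Assumption: with tokenize=False the item is already a token list
            tokens = item
        vectors.append(toy_t2v(tokens))
    # Mirror the (item_vectors, token_vectors) shape seen in the D2V example
    return vectors, None


items = [{"stem": "a bb ccc"}, {"stem": "dd e"}]
item_vectors, token_vectors = toy_infer_vector(items, key=lambda x: x["stem"])
print(item_vectors)   # [[3.0, 6.0], [2.0, 3.0]]
print(token_vectors)  # None
```

Here key lets the caller pass structured items (dictionaries) while the tokenizer still receives plain text.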

class EduNLP.I2V.i2v.I2V(tokenizer, t2v, *args, tokenizer_kwargs: Optional[dict] = None, pretrained_t2v=False, **kwargs)[source]

This class is only a base interface and should not be used directly. To get a vector from an item, use a concrete model such as D2V or W2V.

Parameters
  • tokenizer (str) – the tokenizer name

  • t2v (str) – the name of token2vector model

  • args – the parameters passed to t2v

  • tokenizer_kwargs (dict) – the parameters passed to tokenizer

  • pretrained_t2v (bool) –

    True: use pretrained t2v model

    False: use your own t2v model

  • kwargs – the parameters passed to t2v

Examples

>>> from EduNLP.I2V.i2v import D2V
>>> item = {r"如图来自古希腊数学家希波克拉底所研究的几何图形.此图由三个半圆构成,三个半圆的直径分别为直角三角形$ABC$的斜边$BC$, "
...         r"直角边$AB$, $AC$.$\bigtriangleup ABC$的三边所围成的区域记为$I$,黑色部分记为$II$, 其余部分记为$III$.在整个图形中随机取一点,"
...         r"此点取自$I,II,III$的概率分别记为$p_1,p_2,p_3$,则$\SIFChoice$$\FigureID{1}$"}
>>> model_path = "examples/test_model/test_gensim_luna_stem_tf_d2v_256.bin"
>>> i2v = D2V("text", "d2v", filepath=model_path, pretrained_t2v=False)
>>> i2v(item)
([array([...], dtype=float32)], None)

Returns

i2v model

Return type

I2V
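
The base-interface pattern described for this class can be sketched without EduNLP. ToyI2V, ToyD2V, and count_vector are hypothetical names, and the idea that calling the object delegates to infer_vector is an assumption drawn from the i2v(item) calls in the examples, not a statement about the library's internals.

```python
class ToyI2V:
    """Hypothetical base interface: stores the parts, leaves inference abstract."""

    def __init__(self, tokenizer, t2v):
        self.tokenizer = tokenizer
        self.t2v = t2v

    def __call__(self, items):
        # Assumption: calling the object delegates to infer_vector
        return self.infer_vector(items)

    def infer_vector(self, items):
        raise NotImplementedError("use a concrete subclass")


class ToyD2V(ToyI2V):
    """Hypothetical concrete model: one vector per item, no token vectors."""

    def infer_vector(self, items):
        return [self.t2v(self.tokenizer(item)) for item in items], None


def count_vector(tokens):
    """Hypothetical t2v step: a one-dimensional vector holding the token count."""
    return [float(len(tokens))]


model = ToyD2V(str.split, count_vector)
print(model(["a b c", "d e"]))  # ([[3.0], [2.0]], None)

try:
    ToyI2V(str.split, count_vector)(["a b c"])
except NotImplementedError as err:
    print("base class:", err)
```

The base class fails as soon as it is asked for vectors, which is why a concrete subclass is the intended entry point.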

class EduNLP.I2V.i2v.W2V(tokenizer, t2v, *args, tokenizer_kwargs: Optional[dict] = None, pretrained_t2v=False, **kwargs)[source]

This model converts tokens into vectors.

Bases: I2V

Parameters
  • tokenizer (str) – the tokenizer name

  • t2v (str) – the name of token2vector model

  • args – the parameters passed to t2v

  • tokenizer_kwargs (dict) – the parameters passed to tokenizer

  • pretrained_t2v (bool) –

    True: use pretrained t2v model

    False: use your own t2v model

  • kwargs – the parameters passed to t2v

Examples

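>>> from EduNLP.I2V.i2v import get_pretrained_i2v
>>> i2v = get_pretrained_i2v("test_w2v", "examples/test_model/data/w2v")
>>> # The item reads: "Some scholars hold that 'learning' must adapt to reality"
>>> item_vector, token_vector = i2v(["有学者认为:‘学习’,必须适应实际"])
>>> item_vector
[array([...], dtype=float32)]
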
Returns

i2v model

Return type

W2V

infer_vector(items, tokenize=True, indexing=False, padding=False, key=<function W2V.<lambda>>, *args, **kwargs) → tuple[source]

Converts items to vectors. The model must be loaded before this function is called.

Parameters
  • items (str) – the text of question

  • tokenize (bool) – True: tokenize the item

  • indexing (bool) –

  • padding (bool) –

  • key (lambda function) – a function passed to the tokenizer that selects the text to be processed

  • args – the parameters passed to t2v

  • kwargs – the parameters passed to t2v

Returns

vector

Return type

list
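
The examples on this page show a two-part result: the D2V examples print a pair whose second element is None, while the W2V example unpacks the result into item_vector and token_vector. A minimal sketch of code that handles both shapes follows; summarize_result is a hypothetical helper, and the sample values (including the nested layout of the token vectors) are made up for illustration.

```python
def summarize_result(result):
    """Describe an (item_vectors, token_vectors) pair without assuming both exist."""
    item_vectors, token_vectors = result
    return {
        "n_items": len(item_vectors),
        "item_dim": len(item_vectors[0]) if len(item_vectors) else 0,
        # D2V-style results carry None here; W2V-style results carry token vectors
        "has_token_vectors": token_vectors is not None,
    }


# Made-up values standing in for real model output
d2v_like = ([[0.1, 0.2, 0.3]], None)
w2v_like = ([[0.1, 0.2]], [[[0.1, 0.2], [0.3, 0.4]]])

print(summarize_result(d2v_like))
print(summarize_result(w2v_like))
```

Checking the second element for None before using it keeps downstream code independent of which model produced the vectors.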

EduNLP.I2V.i2v.get_pretrained_i2v(name, model_dir='/home/docs/.EduNLP/model')[source]

Use this function to convert items to vectors easily with a pretrained model.

Parameters
  • name (str) – the name of the item2vector model, e.g. d2v_all_256, d2v_sci_256, d2v_eng_256, d2v_lit_256, w2v_sci_300, w2v_lit_300

  • model_dir (str) – the directory containing the model; default: MODEL_DIR = '~/.EduNLP/model'

Returns

i2v model

Return type

I2V

Examples

>>> from EduNLP.I2V.i2v import get_pretrained_i2v
>>> item = {r"如图来自古希腊数学家希波克拉底所研究的几何图形.此图由三个半圆构成,三个半圆的直径分别为直角三角形$ABC$的斜边$BC$, "
...         r"直角边$AB$, $AC$.$\bigtriangleup ABC$的三边所围成的区域记为$I$,黑色部分记为$II$, 其余部分记为$III$.在整个图形中随机取一点,"
...         r"此点取自$I,II,III$的概率分别记为$p_1,p_2,p_3$,则$\SIFChoice$$\FigureID{1}$"}
>>> i2v = get_pretrained_i2v("test_d2v", "examples/test_model/data/d2v")
>>> print(i2v(item))
([array([...], dtype=float32)], None)
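
The signature above shows model_dir='/home/docs/.EduNLP/model' while the parameter description gives '~/.EduNLP/model'. One plausible reading, stated here as an assumption, is that the default is the user's home directory expanded on the machine that built the documentation. The sketch below shows that expansion with the standard library only; resolve_model_dir is a hypothetical helper, not an EduNLP function.

```python
from pathlib import Path

MODEL_DIR = "~/.EduNLP/model"  # default quoted in the parameter description


def resolve_model_dir(model_dir: str = MODEL_DIR) -> Path:
    """Expand a leading '~' to the current user's home directory."""
    return Path(model_dir).expanduser()


# The default resolves under the current user's home directory
print(resolve_model_dir())

# A path without '~' is returned unchanged
print(resolve_model_dir("examples/test_model/data/d2v"))
```

On a machine whose home directory is /home/docs, the first call would print the path shown in the signature.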