Frequently asked questions

Labs

1. Argument 1 to “get_top_n” has incompatible type “Dict[str, int]”; expected “Dict[str, Union[int, float]]” [arg-type]

This problem frequently occurs in lab_1_keywords_tfidf and is easily fixed.

Typically, this remark is followed by 2 notes:

  • note: "Dict" is invariant -- see [link]

  • note: Consider using "Mapping" instead, which is covariant in the value type

Although the error message may not seem particularly clear, the fix is simple: carefully follow the task description. The task description requires students to demonstrate get_top_n on two dictionaries: one with TF-IDF scores and one with chi-values. Both dictionaries have float values, and such usage does not cause any problems.

Issues begin when get_top_n is called on a frequency dictionary, which the task description does not require. Frequency dictionaries have integer values, and because Dict is invariant in its value type, Dict[str, int] does not match the Dict[str, Union[int, float]] annotation of get_top_n in this lab. This is why MyPy complains.

To fix this problem, only use get_top_n on dictionaries with float values.
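
A minimal sketch of the situation, assuming a simplified stand-in for get_top_n (the body below is hypothetical and is not the lab's reference implementation; only the parameter annotation is taken from the error message). Both calls would run fine at runtime; the difference is only in what MyPy accepts:

```python
from typing import Dict, List, Union


def get_top_n(scores: Dict[str, Union[int, float]], top: int) -> List[str]:
    """Hypothetical stand-in: return the `top` keys with the highest values."""
    ranked = sorted(scores, key=lambda key: scores[key], reverse=True)
    return ranked[:top]


# Illustrative data, not taken from the lab assets.
tfidf: Dict[str, float] = {"cat": 0.42, "dog": 0.17, "the": 0.01}
frequencies: Dict[str, int] = {"cat": 3, "dog": 1, "the": 7}

# Float values, as the task description requires: no arg-type error.
top_by_tfidf = get_top_n(tfidf, 2)

# Integer values: this is the call that triggers the arg-type error,
# so it is left commented out.
# top_by_frequency = get_top_n(frequencies, 2)
```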

2. Cannot find implementation or library stub for module named “main” [import]

Let’s say the structure of your project looks like this:

+-- 2023-2-level-labs
    +-- config
    +-- docs
    +-- lab_1_keywords_tfidf
        +-- assets
        +-- tests
        +-- main.py
        +-- start.py
        +-- target_score.txt
        +-- README.md
    +-- seminars
...

You want to import functions from main.py. The checking program analyses your code from the root folder of the repository, so from its point of view the path to main.py is lab_1_keywords_tfidf/main.py

This is why, to import functions from main.py in your start.py, you need to write the import as follows:

from lab_1_keywords_tfidf.main import <functions you want to import>

3. Argument 1 to <function name> has incompatible type “Optional[<certain type>]”; expected “<certain type>”

Some laboratory works require input data to be checked. In other words, apart from its main logic, a function should verify that all input arguments are of the expected type and, for example, return None otherwise. This is precisely why the MyPy warning is raised: if the first of two chained functions can return None to signal corrupt data, and the second one does not accept None as input, there is a risk of passing data that is known to be incorrect.

To avoid this MyPy remark, check that the returned value is not None before passing it to the second function.

For example, let’s say we have the following two functions. The first one unites two lists, and the second one sums all the elements in the list.

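from typing import Optional


def function1(arg1: list[int], arg2: list[int]) -> Optional[list[int]]:
    # Input check: an empty list is treated as corrupt data
    if not arg1 or not arg2:
        return None
    return arg1 + arg2


def function2(arg: list[int]) -> Optional[int]:
    # Input check: an empty list is treated as corrupt data
    if not arg:
        return None
    return sum(arg)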

We want to use these functions sequentially: first unite the two lists, and then find the sum of the result. This is an incorrect way to do that:

united_list = function1(list1, list2)
elements_sum = function2(united_list)

function1 can return None, and None must not be passed to function2. The correct way is to check the result first:

united_list = function1(list1, list2)
if united_list is not None:
    elements_sum = function2(united_list)

4. Incompatible types in assignment (expression has type X, variable has type Y)

Python is a dynamically typed programming language, meaning that during execution the same variable can be assigned values of different types. Although the language does not prohibit this, it is not the best practice. Reusing variables in this way makes your code more error-prone, because mistakes become more likely and harder to track down. This is why MyPy highlights such variables: keeping the type of a variable consistent across re-assignments solves the problem. More about incompatible re-definitions. More about perks of mypy-style static typing.
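
A short illustration of the kind of re-assignment that triggers this message, and one way to avoid it. The variable names are hypothetical and chosen only for this sketch:

```python
raw_count = "42"  # MyPy infers the type str for this variable

# Reusing the same name for an int value is what MyPy reports as
# an incompatible assignment, so the line is left commented out:
# raw_count = int(raw_count)

# Giving the converted value its own name keeps each variable at one type.
count = int(raw_count)
```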

5. PyCharm cannot find the interpreter

In many cases the cause is that the project was opened incorrectly in PyCharm. Make sure that you open the whole 202X-2-level-labs folder as a project, not just the folder with a particular lab.

More details on opening the project correctly in PyCharm can be found in “Preparing for the course” (Подготовка к прохождению курса).

Running tests

1. Why is my CI job cancelled?

Usually that happens because your CI check runs for too long. A possible reason is that you do not limit the number of articles that you collect from your seed URL. If you feel that the problem is with the infrastructure, call a mentor in the group chat.
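
One way to keep the number of collected articles under control is to stop as soon as a fixed limit is reached. The sketch below is only an illustration: collect_article_urls, max_articles and the example.com URLs are hypothetical names and do not come from the lab code.

```python
from typing import Iterable, List


def collect_article_urls(candidate_urls: Iterable[str], max_articles: int) -> List[str]:
    """Hypothetical helper: keep at most `max_articles` unique URLs."""
    collected: List[str] = []
    for url in candidate_urls:
        # Stop as soon as the limit is reached instead of walking everything
        if len(collected) >= max_articles:
            break
        if url not in collected:
            collected.append(url)
    return collected


# Placeholder URLs standing in for links found on a seed page.
found = [f"https://example.com/news/{index}" for index in range(1000)]
article_urls = collect_article_urls(found, max_articles=5)
```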

2. Why is my CI job not started?

Usually that happens because your fork has conflicts with the base repository. Resolve them by merging the upstream, or, if this all sounds new to you, call a mentor in the group chat.