Frequently asked questions
Labs
1. Argument 1 to “get_top_n” has incompatible type “Dict[str, int]”; expected “Dict[str, Union[int, float]]” [arg-type]
This problem frequently occurs in lab_1_keywords_tfidf and is easily fixed.
Typically, this remark is followed by 2 notes:
note: "Dict" is invariant -- see [link]note: Consider using "Mapping" instead, which is covariant in the value type
Although error message may not seem to be particularly clear, it is
rather simple to fix it. To solve the problem it is enough to carefully
follow the task description. In task description, students are required
to demonstrate get_top_n using two dictionaries: the one with
TF-IDF scores and the one with chi-values. Both of those
dictionaries contain float data as values, and such usage does not
cause any problems.
Issues begin when one decides to use get_top_n on frequency
dictionary, which is not required in task description. Frequency
dictionaries have integer values which does not match very well with the
get_top_n typing in this particular lab. This is why it leads to
MyPy complaining.
To fix this problem, only use get_top_n on dictionaries with
float values.
2. Cannot find implementation or library stub for module named “main” [import]
Let’s say the structure of your project looks like this:
+-- 2023-2-level-labs
+-- config
+-- docs
+-- lab_1_keywords_tfidf
+-- assets
+-- tests
+-- main.py
+-- start.py
+-- target_score.txt
+-- README.md
+-- seminars
...
You want to import functions from main.py. To do that, remember that
the checking program looks at your code from the root folder, meaning
that for it the correct name of the main.py would be the following:
lab_1_keywords_tfidf/main.py
This is why to import functions from main.py in your start.py
you need to put it the following way:
from lab_1_keywords_tfidf.main import <functions you want to import>
3. Argument 1 to <function name> Has incompatible type “Optional[<certain type>]”; expected “[<certain type>]”
In some of the laboratory works there is a requirement to check input
data. In other words, apart from main logic of the function, one should
verify that all input arguments are of the expected type, and, for
example, return None otherwise. This is precisely why this MyPy
warning is raised: if in a sequence of two functions the former one can
return None as an indicator of corrupt data, and the latter one does
not expect None among correct input values, there is a risk of
passing data that is obviously incorrect.
To avoid this MyPy remark, it is necessary to check whether the
returned value is not None before proceeding to feed it to the
second function.
For example, let’s say we have the following two functions. The first one unites two lists, and the second one sums all the elements in the list.
def function1(arg1: list[int], arg2: list[int]) -> Optional[list[int]]:
if not arg1 or not arg2:
return None
return arg1 + arg2
def function2(arg: list[int]) -> Optional[int]:
if not arg:
return None
return sum(arg)
We want to use these functions sequentially: firstly we want to unite two lists, and then find its sum. This is an incorrect way to do that:
united_list = function1(list1, list2)
elements_sum = function2(united_list)
function1 can return None, and we must not pass it to
function2. Correct way to check it:
united_list = function1(list1, list2)
if united_list:
elements_sum = function2(united_list)
4. Incompatible types in assignment (expression has type X, variable has type Y)
Python is a dynamically typed programming language, meaning that during
execution of a program in Python same variables can be assigned values
of different types. Although it is not prohibited in the language, it
may still be not the best practice. Reusing variables in such a way can
make your code more vulnerable as there would be a higher probability of
making a mistake that is hard to track. This is why MyPy highlights
such variables: maintaining consistency of typing throughout value
re-assigning should solve this problem.
More about incompatible re-definitions.
More about perks of mypy-style static typing.
5. During working in Visual Studio Code, interpreter cannot be found
In many cases the issue turns out to be wrong opening of the Visual Studio Code.
Make sure that you open the whole 202X-2-level-labs as a project,
not just the folder with a particular lab.
More details on correct Visual Studio Code opening can be found in Подготовка к прохождению курса.
Running tests
1. Why is my CI job cancelled?
Usually that happens because your CI check runs for too long. Possible reasons is that you do not control number of articles that you collect from your seed URL. If you feel that the problem is with infrastructure, call a mentor in the group chat.
2. Why is my CI job not started?
Usually that happens because your fork has conflicts with a base repository. Resolve them by merging the upstream, or if it all sounds new for you, call a mentor in the group chat.
3. Why does my CI or mentor not like my seminar files?
Seminar files are used to teach you some basic knowledge about Python, but they are not part of the laboratory works you submit to the repository. So if you accidentally push them to your Pull Request, your mentor will ask you to remove them or there might be CI errors.
To undo changes in seminar files, you must first find the commit where you changed them. To do so, execute in Visual Studio Code terminal:
git remote -v
You should have two repositories: origin and upstream, one of which is your fork and the other one is the main repository.
If you don’t have an upstream repository, execute:
git remote add upstream <link-to-the-main-repository>
git fetch upstream
Now you need to get the newest state of the main repository via:
git fetch upstream
Once you’ve done that, you need to replace the current state of your local seminar folder with what is available in the main repository.
git checkout upstream/main seminars
The changes will be applied to your current state, and you will be able to add, commit, and push the updated changes as usual.