The claim that "Python is not a typed language" now raises my pulse rate just as "Python is simply a scripting language" did many years ago, when that claim was common.
One just needs to open a Python console and enter 1+"1". The result is not 2, as in PHP for example, but a TypeError. Python is by all means strongly typed, and it also differentiates between mutable and immutable types. Because Python checks types only at run time, the above-mentioned addition error shows up not while the code is being written, but only when it is executed.
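The console experiment can also be written as a small script. This is only a minimal sketch; the variable names are chosen for illustration.

```python
# Strong typing: Python refuses to add an int and a str implicitly.
try:
    result = 1 + "1"
except TypeError as error:
    result = None
    message = str(error)
    print("TypeError:", message)

# An explicit conversion states the intent and works.
total = 1 + int("1")
print(total)  # 2
```

The error appears only when the offending line is actually executed, which is exactly the run-time behaviour described above.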
Duck typing
Despite strong typing at run time, Python does not force you to declare variable types while programming. It follows the principle of duck typing instead: if a bird quacks like a duck, then it is probably a duck, no matter whether a mallard or a harlequin. In the same way, whether a variable holds an int, a float or a complex value makes no difference at first if all that is needed is addition or multiplication.
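A minimal sketch of duck typing with numbers follows. The function name double_and_add is made up for illustration; Fraction comes from the standard library.

```python
from fractions import Fraction


def double_and_add(value, increment):
    # No types are declared: any objects that support * and + will do.
    return value * 2 + increment


print(double_and_add(3, 1))                            # 7
print(double_and_add(1.5, 0.5))                        # 3.5
print(double_and_add(Fraction(1, 2), Fraction(1, 4)))  # 5/4
```

The function never asks what it was given. It only relies on the operations it uses.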
Hungarian notation: Hands off!
The wish to identify variable types while programming led over 30 years ago to coding standards such as Hungarian notation, in which the type is prefixed to the variable name. For example, nCustomers is meant to tell the developer that the variable holds a number, namely the number of customers. From today's perspective, modern development environments have long since made this curious relic of the previous millennium obsolete. Nonetheless, I have still seen it in use at a few enterprises. Please stop: it belongs in a museum.
Type hinting in PyCharm
Today, developers are supported by sophisticated IDEs. I began 10 years ago with the Eclipse IDE for Python, eventually switched to the Wing IDE and finally ended up about 3 years ago at PyCharm from JetBrains. Since installing the JetBrains package on my computer, I haven't needed any other IDE, not even for SQL. PyCharm has always offered full support for the latest features of Python, including type hinting from Python 3.5 onward.
The Python developers, especially Guido van Rossum, explicitly intend the new type hinting as optional support only. According to PEP 484: "It should also be emphasized that Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention."
A simple example
The code shown here comprises a simple example class with a number of functions.
Line 8 shows type hinting of parameters in the constructor by means of a colon after each parameter, followed by the type.
Line 12 shows a method declaration with a return type, using the arrow "->" and the type between the closing parenthesis and the colon.
Line 15 shows a method declaration with multiple possible return types. "Union" must first be imported from the typing module (line 1).
Line 6 shows an attribute definition with a type. From Python 3.6 onward, this notation can be used wherever variables are declared!
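The four notations can be tried out with a stand-in sketch like the one below. It is not the original listing, so the line numbers mentioned above do not apply to it, and the class Customer with its methods is purely hypothetical.

```python
from typing import Union


class Customer:
    # Attribute definition with a type (variable annotation, Python 3.6+)
    visits: int = 0

    # Parameter types follow a colon after each parameter name
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    # The return type follows "->" between the closing parenthesis and the colon
    def name_lower(self) -> str:
        return self.name.lower()

    # Union lists several possible return types
    def age_or_unknown(self) -> Union[int, str]:
        return self.age if self.age >= 0 else "unknown"


customer = Customer("Ada", 36)
print(customer.name_lower())      # ada
print(customer.age_or_unknown())  # 36
```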
What happens if this is not observed?
PyCharm analyzes the code continuously while you type, evaluates the type hints and points out possible errors without the code having to be executed.
The code in line 12 is highlighted because PyCharm has noticed that, although an int is declared as the return type, the result of lower() can only be a str.
My input parameters in lines 24 and 26 are highlighted because their types obviously do not match those in the method definition.
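Both kinds of mismatch can be reproduced in a few lines. This is a sketch with a hypothetical class Greeter, not the code from the screenshots. A static checker such as the one built into PyCharm is expected to flag the marked places, while the interpreter runs the code without looking at the hints.

```python
class Greeter:
    def __init__(self, name: str):
        self.name = name

    # Declared to return int, but str.lower() always returns a str.
    # A static checker flags this; the interpreter does not.
    def lowered(self) -> int:
        return self.name.lower()


greeter = Greeter("ADA")
value = greeter.lowered()
print(type(value).__name__)  # str, despite the "-> int" declaration

# An int is passed where a str is declared: accepted at run time ...
wrong = Greeter(42)
try:
    wrong.lowered()
    failed = False
except AttributeError:
    # ... and it only fails once int turns out to have no lower() method.
    failed = True
print(failed)  # True
```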
The IDE goes even a step further here and interprets the code according to certain rules. For example, the code below shows how the "base" parameter is handled in the method invocation. Although no type is specified to start with, the code on the next line shows the IDE that the parameter is to be multiplied. In the spirit of duck typing, it therefore assumes that what is passed does not matter as long as it provides the built-in multiplication method "__mul__".
The fastest way to obtain the method view in the last screenshot, incidentally, is to place the cursor on a method or function and simply press CTRL+Q. Alternatively, while typing, one can press CTRL+P once the parenthesis for the parameters is open, whereupon a tool tip with the parameters and their defined types appears immediately.
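What "anything with __mul__" means at run time can be sketched as follows. The function scale and the class Money are hypothetical names used only for this illustration.

```python
class Money:
    """Hypothetical value type that supports multiplication via __mul__."""

    def __init__(self, cents: int):
        self.cents = cents

    def __mul__(self, factor):
        return Money(self.cents * factor)


def scale(base, factor=3):
    # "base" is not annotated; the only requirement is that base * factor works.
    return base * factor


print(scale(2))                 # 6
print(scale("ab"))              # ababab
print(scale(Money(150)).cents)  # 450

# An object without a suitable __mul__ fails only when the line is executed.
try:
    scale(None)
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```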
Should all one's code be rewritten now?
Hints about minimum standards for code documentation
Of course, that is not necessary. As mentioned above, type hinting is purely optional support. When executing code, the Python interpreter stonily refrains from checking any type hints, because they are intended for tools such as the IDE; it merely shoves them into a metadata attribute called "__annotations__".
However, I personally see this as a great opportunity to document code more precisely. Whenever I set a type hint, I immediately give other users and myself an indication of how best to handle the code in future. Of course, code comments have always been available too, but they have done little to flag errors during development.
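A minimal sketch that makes this visible follows. The function repeat is a made-up example.

```python
def repeat(text: str, times: int) -> str:
    return text * times


# The hints are stored as plain metadata on the function object.
print(repeat.__annotations__)
# {'text': <class 'str'>, 'times': <class 'int'>, 'return': <class 'str'>}

# They are not enforced: a list is accepted although str is declared.
print(repeat([1], 2))  # [1, 1]
```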
Support in work as a data scientist
For me, Python in data science means Pandas and PySpark. I personally always write my code in an editor in PyCharm and then send selected code to the interpreter via a keyboard shortcut. Here, I constantly have the problem of the auto-complete lagging behind, because the methods used do not always correctly specify their return types. In Pandas, for example, the editor can easily fail to recognize a data frame as such. The IDE cannot be blamed if the imported module does not deliver the metadata. However, the moment I write "df: DataFrame = ..." instead of simply "df = ..." somewhere, the IDE immediately picks up the thread again and provides me with all details via the auto-complete mechanism.
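The effect does not depend on Pandas. Here is a minimal sketch using only the standard library, in which json.loads stands in for a call whose return type the editor cannot determine. The variable names are made up for illustration, and how much auto-completion improves depends on the editor.

```python
import json

raw = '{"customer": "Ada", "visits": 3}'

# json.loads may return a dict, list, str, number and so on, so an
# editor has little to offer for auto-completion on the result.
untyped = json.loads(raw)

# Annotating the variable tells the editor what to expect.
# The run-time behaviour is unchanged.
record: dict = json.loads(raw)

print(record == untyped)  # True
print(sorted(record))     # ['customer', 'visits']
```

The pattern is the same as "df: DataFrame = ...": one annotation at the point of assignment, and the editor can follow along from there.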
Outlook
PyCharm does not yet cover all cases of type hinting with information for the user: the editor supports all new notations but still behaves somewhat conservatively when it comes to error notifications. One must also bear in mind that the proposals for unambiguous formatting of type hints are already quite old, though support was only introduced from Python 3.5 onward. Because such recent Python installations are rarely found at the enterprise level, this topic is confined to individual development environments for the time being. However, developer support can be expected to keep maturing steadily with PyCharm's quarterly updates.
The typing module's functionality ranges much further than what I have presented here. More information can be found directly in the Python documentation at docs.python.org/3.6/library/typing.html