<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Python on KK's Blog (fromkk)</title><link>https://fromkk.com/tags/python/</link><description>Recent content in Python on KK's Blog (fromkk)</description><generator>Hugo</generator><language>en</language><managingEditor>bebound@gmail.com (KK)</managingEditor><webMaster>bebound@gmail.com (KK)</webMaster><lastBuildDate>Mon, 12 Jan 2026 17:36:53 +0800</lastBuildDate><atom:link href="https://fromkk.com/tags/python/index.xml" rel="self" type="application/rss+xml"/><item><title>Namespace Package in Python</title><link>https://fromkk.com/posts/namespace-package-in-python/</link><pubDate>Sun, 10 Aug 2025 18:04:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/namespace-package-in-python/</guid><description>&lt;p&gt;Recently, a &lt;a href="https://github.com/Azure/azure-cli/issues/31843#issuecomment-3125269740" target="_blank" rel="noopener noreffer "&gt;GitHub issue&lt;/a&gt; about namespace packages came up in Azure CLI, so it is a good time to write down what I know about namespace packages.&lt;/p&gt;
&lt;h2 id="what-is-namespace-package"&gt;What is a Namespace Package&lt;/h2&gt;
&lt;p&gt;If several packages share the same root folder, that root folder is a namespace package. &lt;code&gt;subpackageA&lt;/code&gt; and &lt;code&gt;subpackageb&lt;/code&gt; can be installed separately, even in different directories on the Python path, yet they can be imported as if they were parts of a single package: &lt;code&gt;import root&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>Modern pip build process (--use-pep517)</title><link>https://fromkk.com/posts/modern-pip-build-process-use-pep517/</link><pubDate>Sun, 24 Nov 2024 20:49:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/modern-pip-build-process-use-pep517/</guid><description>&lt;p&gt;Nowadays, &lt;code&gt;pyproject.toml&lt;/code&gt; has become the standard configuration file for packaging. Compared with the old &lt;code&gt;setup.py&lt;/code&gt;, it adds two features: PEP 517 and PEP 518.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://peps.python.org/pep-0517/" target="_blank" rel="noopener noreffer "&gt;PEP 517&lt;/a&gt; defines two hooks, &lt;code&gt;build_wheel&lt;/code&gt; and &lt;code&gt;build_sdist&lt;/code&gt;, which are required to build the package from source. Every build backend must implement both hooks, which makes it possible to create alternative build backends such as &lt;code&gt;flit&lt;/code&gt; or &lt;code&gt;poetry&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-toml" data-lang="toml"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;[&lt;span style="color:#a6e22e"&gt;build-system&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Defined by PEP 518:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;requires&lt;/span&gt; = [&lt;span style="color:#e6db74"&gt;&amp;#34;flit&amp;#34;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Defined by this PEP:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;build-backend&lt;/span&gt; = &lt;span style="color:#e6db74"&gt;&amp;#34;local_backend&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#a6e22e"&gt;backend-path&lt;/span&gt; = [&lt;span style="color:#e6db74"&gt;&amp;#34;backend&amp;#34;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Besides &lt;code&gt;setuptools&lt;/code&gt;, there are other build backends such as &lt;code&gt;hatchling&lt;/code&gt; and &lt;code&gt;flit&lt;/code&gt;. You can find examples here: &lt;a href="https://packaging.python.org/en/latest/tutorials/packaging-projects/#choosing-a-build-backend" target="_blank" rel="noopener noreffer "&gt;Python Packaging User Guide - Choosing a build backend&lt;/a&gt;&lt;/p&gt;</description></item><item><title>sys.path in Python</title><link>https://fromkk.com/posts/sys-dot-path-in-python/</link><pubDate>Sun, 11 Aug 2024 15:56:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/sys-dot-path-in-python/</guid><description>&lt;p&gt;Here is how &lt;code&gt;sys.path&lt;/code&gt; is set up in Python, with some parts omitted.&lt;/p&gt;
&lt;h2 id="python-command-line-arguments"&gt;Python Command Line Arguments&lt;/h2&gt;
&lt;p&gt;By default, as initialized upon program startup, a potentially unsafe path is prepended to &lt;code&gt;sys.path&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;python -m&lt;/code&gt;: prepend the current working directory.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;python script.py&lt;/code&gt;: prepend the script’s directory. If the script is a symbolic link, the link is resolved first.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;python -c&lt;/code&gt; and python (REPL): prepend an empty string, which means the current working directory.&lt;/p&gt;
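&lt;p&gt;The last case can be observed by starting a child interpreter with &lt;code&gt;-c&lt;/code&gt; and printing its &lt;code&gt;sys.path[0]&lt;/code&gt;. This is only a sketch: the helper name &lt;code&gt;first_path_entry_for_dash_c&lt;/code&gt; is made up for illustration, and it assumes the child runs with default settings (no &lt;code&gt;PYTHONSAFEPATH&lt;/code&gt; in the environment).&lt;/p&gt;

```python
import os
import subprocess
import sys


def first_path_entry_for_dash_c():
    """Return sys.path[0] as seen by a child started with `python -c` (hypothetical helper)."""
    # Drop PYTHONSAFEPATH so the child behaves like a default interpreter.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.path[0])"],
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    # print() appends a newline; strip only that so an empty entry stays empty.
    return result.stdout.rstrip("\r\n")


# An empty string stands for the current working directory.
print(repr(first_path_entry_for_dash_c()))
```

&lt;p&gt;The same approach works for the other two invocations by changing the argument list passed to the child interpreter.&lt;/p&gt;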
&lt;p&gt;You can skip prepending these paths with the &lt;code&gt;-P&lt;/code&gt; option (Python 3.11+).&lt;/p&gt;</description></item><item><title>__import__ in Python</title><link>https://fromkk.com/posts/import-in-python/</link><pubDate>Sun, 07 Apr 2024 15:58:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/import-in-python/</guid><description>&lt;p&gt;It&amp;rsquo;s known that Python&amp;rsquo;s &lt;code&gt;import&lt;/code&gt; statement is implemented by the &lt;code&gt;__import__&lt;/code&gt; function. In general, if we want to import a module dynamically, we can use the &lt;code&gt;import_module&lt;/code&gt; function, which is a wrapper around &lt;code&gt;__import__&lt;/code&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The most important difference between these two functions is that import_module() returns the specified package or module (e.g. pkg.mod), while __import__() returns the top-level package or module (e.g. pkg). &amp;ndash; &lt;a href="https://docs.python.org/3/library/importlib.html#importlib.import_module" target="_blank" rel="noopener noreffer "&gt;https://docs.python.org/3/library/importlib.html#importlib.import_module&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
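&lt;p&gt;The difference described in the quote can be seen with a small sketch. The choice of &lt;code&gt;json.decoder&lt;/code&gt; is just an example of a dotted module name from the standard library.&lt;/p&gt;

```python
import importlib

# __import__ with a dotted name returns the top-level package ...
top = __import__("json.decoder")
# ... while import_module returns the module that was actually named.
leaf = importlib.import_module("json.decoder")

print(top.__name__)   # json
print(leaf.__name__)  # json.decoder
```

&lt;p&gt;The submodule is still reachable from the top-level result as an attribute, here &lt;code&gt;top.decoder&lt;/code&gt;.&lt;/p&gt;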
&lt;p&gt;&lt;code&gt;import itertools&lt;/code&gt; and &lt;code&gt;from requests import exceptions&lt;/code&gt; can be translated to:&lt;/p&gt;</description></item><item><title>Python 3.11 changes</title><link>https://fromkk.com/posts/python-3-dot-11-changes/</link><pubDate>Sun, 10 Dec 2023 15:24:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/python-3-dot-11-changes/</guid><description>&lt;p&gt;In &lt;a href="https://github.com/Azure/azure-cli/pull/26923" target="_blank" rel="noopener noreffer "&gt;[Packaging] Support Python 3.11 by bebound · Pull Request #26923 · Azure/azure-cli (github.com)&lt;/a&gt;, I bumped azure-cli to use Python 3.11. We had already bumped the dependencies in other PRs, so I expected this to be a small PR, but in the end it required a lot of changes.&lt;/p&gt;
&lt;h2 id="args-dot-getargspec"&gt;&lt;code&gt;inspect.getargspec&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;getargspec&lt;/code&gt; was removed in Python 3.11. You can easily replace it with &lt;a href="https://docs.python.org/3/library/inspect.html#inspect.getfullargspec" target="_blank" rel="noopener noreffer "&gt;&lt;code&gt;getfullargspec&lt;/code&gt;&lt;/a&gt;. It returns &lt;code&gt;FullArgSpec(args, varargs, varkw, defaults, kwonlyargs, kwonlydefaults, annotations)&lt;/code&gt; instead of &lt;code&gt;ArgSpec(args, varargs, keywords, defaults)&lt;/code&gt;, so &lt;code&gt;args, _, kw, _ = inspect.getargspec(fn)&lt;/code&gt; can be replaced by &lt;code&gt;args, _, kw, *_ = inspect.getfullargspec(fn)&lt;/code&gt;. However, &lt;code&gt;getfullargspec&lt;/code&gt; is retained primarily for use in code that needs to maintain compatibility with the Python 2 &lt;code&gt;inspect&lt;/code&gt; module API.&lt;/p&gt;</description></item><item><title>Memory Leak in Python multiprocessing.Pool</title><link>https://fromkk.com/posts/memory-leak-in-python-multiprocessing-dot-pool/</link><pubDate>Wed, 16 Mar 2022 21:04:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/memory-leak-in-python-multiprocessing-dot-pool/</guid><description>&lt;p&gt;Our Django app had a long-standing memory leak, which I fixed recently. Over time, the app&amp;rsquo;s memory usage kept growing, and so did its CPU usage.&lt;/p&gt;
&lt;figure&gt;&lt;img src="https://fromkk.com/images/pool_before.png"&gt;
&lt;/figure&gt;

&lt;p&gt;After some research, I figured out the cause: some views did not close &lt;code&gt;multiprocessing.Pool&lt;/code&gt; after using it. The problem disappeared once I used &lt;code&gt;Pool&lt;/code&gt; in a &lt;code&gt;with&lt;/code&gt; statement.&lt;/p&gt;
&lt;figure&gt;&lt;img src="https://fromkk.com/images/pool_after.png"&gt;
&lt;/figure&gt;
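&lt;p&gt;The fix can be sketched roughly as follows. This is not the code from the Django app: the function names &lt;code&gt;square&lt;/code&gt; and &lt;code&gt;squares&lt;/code&gt; are invented, and the sketch uses &lt;code&gt;ThreadPool&lt;/code&gt; from &lt;code&gt;multiprocessing.pool&lt;/code&gt; so that it runs the same way on every platform. &lt;code&gt;multiprocessing.Pool&lt;/code&gt; supports the same &lt;code&gt;with&lt;/code&gt; form.&lt;/p&gt;

```python
from multiprocessing.pool import ThreadPool


def square(n):
    """Toy task standing in for the real work done by a view."""
    return n * n


def squares(values):
    # Leaving the with-block calls pool.terminate(), so the workers
    # are released even if map() raises.
    with ThreadPool(processes=2) as pool:
        return pool.map(square, values)


print(squares([1, 2, 3]))
```

&lt;p&gt;Without the &lt;code&gt;with&lt;/code&gt; block, every call would leave a pool and its workers behind unless it is closed explicitly.&lt;/p&gt;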

&lt;p&gt;But I was still curious about it, so I wrote some test code. The script was run on Python 3.6.8 and produces similar results when using &lt;code&gt;multiprocessing.pool.ThreadPool&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>How to disable auto strip in CharField in Django</title><link>https://fromkk.com/posts/how-to-disable-auto-strip-in-charfield-in-django/</link><pubDate>Sun, 19 Dec 2021 21:20:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/how-to-disable-auto-strip-in-charfield-in-django/</guid><description>&lt;p&gt;In Django, when you edit a field in the admin page or post data to a form, the leading and trailing whitespace in &lt;code&gt;CharField&lt;/code&gt; and &lt;code&gt;TextField&lt;/code&gt; is removed.&lt;/p&gt;
&lt;p&gt;The reason is the &lt;code&gt;strip=True&lt;/code&gt; parameter of &lt;code&gt;forms.CharField&lt;/code&gt;, which was added in Django 1.9. You can see the discussion in &lt;a href="https://code.djangoproject.com/ticket/4960" target="_blank" rel="noopener noreffer "&gt;Django ticket #4960&lt;/a&gt; and here is the &lt;a href="https://github.com/django/django/blob/4ce59f602ed28320caf3035212cb4d1c5430da2b/django/forms/fields.py#L211" target="_blank" rel="noopener noreffer "&gt;source code&lt;/a&gt;. &lt;code&gt;models.CharField&lt;/code&gt; and &lt;code&gt;models.TextField&lt;/code&gt; use &lt;code&gt;formfield()&lt;/code&gt; to create the form field that interacts with the user, so both of them eventually create a &lt;code&gt;forms.CharField&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>Using JSONField before Django 3.1</title><link>https://fromkk.com/posts/using-jsonfield-before-django-3-dot-1/</link><pubDate>Sat, 11 Sep 2021 21:12:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/using-jsonfield-before-django-3-dot-1/</guid><description>&lt;p&gt;Since Django 3.1, Django supports saving Python data to the database as JSON-encoded data, and it is also possible to query on field values inside a JSONField. The detailed usage can be found &lt;a href="https://docs.djangoproject.com/en/3.2/topics/db/queries/#querying-jsonfield" target="_blank" rel="noopener noreffer "&gt;here&lt;/a&gt;. If you are using an older version and want to try this feature, many packages have backported it; I recommend &lt;a href="https://github.com/laymonage/django-jsonfield-backport" target="_blank" rel="noopener noreffer "&gt;django-jsonfield-backport&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="django-jsonfield-backport"&gt;django-jsonfield-backport&lt;/h2&gt;
&lt;p&gt;This package saves data as JSON in the database and also supports JSON queries. If your database meets the requirements (MySQL &amp;gt; 5.7, PG &amp;gt; 9.5, MariaDB &amp;gt; 10.2 or SQLite &amp;gt; 3.9 with the &lt;a href="https://docs.djangoproject.com/en/3.1/ref/databases/#sqlite-json1" target="_blank" rel="noopener noreffer "&gt;JSON1&lt;/a&gt; extension), you can use JSONField just like Django&amp;rsquo;s native implementation.&lt;/p&gt;</description></item><item><title>Using cibuildwheel to Create Python Wheels</title><link>https://fromkk.com/posts/using-cibuildwheel-to-create-python-wheels/</link><pubDate>Wed, 29 Jul 2020 22:53:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/using-cibuildwheel-to-create-python-wheels/</guid><description>&lt;p&gt;Have you ever tried to install &lt;code&gt;MySQL-python&lt;/code&gt;? It contains C code that has to be compiled while the package is being installed. You have to follow the steps in this article: &lt;a href="https://ruddra.com/install-mysqlclient-macos/" target="_blank" rel="noopener noreffer "&gt;Install MySQL and MySQLClient(Python) in MacOS&lt;/a&gt;. Things get worse if you are using Windows.&lt;/p&gt;
&lt;p&gt;Luckily, a new distribution format, &lt;strong&gt;Wheel&lt;/strong&gt;, was published in &lt;a href="https://www.python.org/dev/peps/pep-0427/" target="_blank" rel="noopener noreffer "&gt;PEP 427&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The wheel binary package format frees installers from having to know about the build system, saves time by amortizing compile time over many installations, and removes the need to install a build system in the target environment.&lt;/p&gt;</description></item><item><title>Import custom package or module in PySpark</title><link>https://fromkk.com/posts/import-custom-package-or-module-in-pyspark/</link><pubDate>Thu, 02 Apr 2020 22:24:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/import-custom-package-or-module-in-pyspark/</guid><description>&lt;p&gt;First, zip all of the dependencies into a single zip file like this. Then you can use one of the following methods to import it.&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code class="language-nil" data-lang="nil"&gt;|-- kk.zip
| |-- kk.py
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="using-py-files-in-spark-submit"&gt;Using --py-files in spark-submit&lt;/h2&gt;
&lt;p&gt;When submitting a Spark job, add the &lt;code&gt;--py-files=kk.zip&lt;/code&gt; parameter. &lt;code&gt;kk.zip&lt;/code&gt; will be distributed together with the main script file and added to the Python module search path (&lt;code&gt;PYTHONPATH&lt;/code&gt;).&lt;/p&gt;
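&lt;p&gt;The reason a zip file works here is that Python can import modules directly from a zip archive on its module search path. The sketch below shows that mechanism locally, without Spark. The helper &lt;code&gt;import_from_zip&lt;/code&gt; and the &lt;code&gt;greet&lt;/code&gt; function are invented for illustration, and it is an assumption that this mirrors what happens on the Spark side.&lt;/p&gt;

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def import_from_zip(module_name, source):
    """Pack `source` as module_name.py into module_name.zip and import it (hypothetical helper)."""
    archive = Path(tempfile.mkdtemp()) / (module_name + ".zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)
    # Putting the archive itself on sys.path makes its modules importable.
    sys.path.insert(0, str(archive))
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


kk = import_from_zip("kk", "def greet():\n    return 'hello from kk'\n")
print(kk.greet())
```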
&lt;p&gt;Then you can use &lt;code&gt;import kk&lt;/code&gt; in your main script file.&lt;/p&gt;</description></item><item><title>C3 Linearization and Python MRO(Method Resolution Order)</title><link>https://fromkk.com/posts/c3-linearization-and-python-mro--method-resolution-order/</link><pubDate>Sat, 14 Mar 2020 17:37:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/c3-linearization-and-python-mro--method-resolution-order/</guid><description>&lt;p&gt;Python supports multiple inheritance: a class can be derived from more than one base class. If an attribute or method is not found in the current class, how does Python decide the order in which to search the superclasses? In simple scenarios, we know the rule is left to right, bottom to top. But when the inheritance hierarchy becomes complicated, it&amp;rsquo;s not easy to answer by intuition.&lt;/p&gt;
&lt;p&gt;For instance, what is the search sequence for class M?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;class&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;X&lt;/span&gt;:&lt;span style="color:#66d9ef"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;class&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;Y&lt;/span&gt;: &lt;span style="color:#66d9ef"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;class&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;Z&lt;/span&gt;:&lt;span style="color:#66d9ef"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;class&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;A&lt;/span&gt;(X,Y):&lt;span style="color:#66d9ef"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;class&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;B&lt;/span&gt;(Y,Z):&lt;span style="color:#66d9ef"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;class&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;M&lt;/span&gt;(B,A,Z):&lt;span style="color:#66d9ef"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;figure class="image-size-s"&gt;&lt;img src="https://fromkk.com/images/python_mro.png"&gt;
&lt;/figure&gt;
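&lt;p&gt;One way to check the result without working through the linearization by hand is to ask Python for the class&amp;rsquo;s &lt;code&gt;__mro__&lt;/code&gt;. The sketch below repeats the class definitions from above so that it runs on its own; the variable name &lt;code&gt;mro_names&lt;/code&gt; is just for illustration.&lt;/p&gt;

```python
class X: pass
class Y: pass
class Z: pass
class A(X, Y): pass
class B(Y, Z): pass
class M(B, A, Z): pass

# __mro__ holds the linearized search order that Python computed for M.
mro_names = [cls.__name__ for cls in M.__mro__]
print(", ".join(mro_names))
```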

&lt;p&gt;The answer is: &lt;code&gt;M, B, A, X, Y, Z, object&lt;/code&gt;&lt;/p&gt;</description></item><item><title>Torchtext snippets</title><link>https://fromkk.com/posts/torchtext-snippets/</link><pubDate>Mon, 01 Jul 2019 21:28:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/torchtext-snippets/</guid><description>&lt;h2 id="load-separate-files"&gt;Load separate files&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;data.Field&lt;/code&gt; parameters are documented &lt;a href="https://torchtext.readthedocs.io/en/latest/data.html#torchtext.data.Field" target="_blank" rel="noopener noreffer "&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;When calling &lt;code&gt;build_vocab&lt;/code&gt;, torchtext adds &lt;code&gt;&amp;lt;unk&amp;gt;&lt;/code&gt; to the vocabulary list. Set &lt;code&gt;unk_token=None&lt;/code&gt; if you want to remove it. If &lt;code&gt;sequential=True&lt;/code&gt; (the default), it also adds &lt;code&gt;&amp;lt;pad&amp;gt;&lt;/code&gt; to the vocab. By default, &lt;code&gt;&amp;lt;unk&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;pad&amp;gt;&lt;/code&gt; are added at the beginning of the vocabulary list.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;LabelField&lt;/code&gt; is similar to &lt;code&gt;Field&lt;/code&gt;, but it sets &lt;code&gt;sequential=False&lt;/code&gt;, &lt;code&gt;unk_token=None&lt;/code&gt; and &lt;code&gt;is_target=True&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;INPUT &lt;span style="color:#f92672"&gt;=&lt;/span&gt; data&lt;span style="color:#f92672"&gt;.&lt;/span&gt;Field(lower&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#66d9ef"&gt;True&lt;/span&gt;, batch_first&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#66d9ef"&gt;True&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;TAG &lt;span style="color:#f92672"&gt;=&lt;/span&gt; data&lt;span style="color:#f92672"&gt;.&lt;/span&gt;LabelField()
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;train, val, test &lt;span style="color:#f92672"&gt;=&lt;/span&gt; data&lt;span style="color:#f92672"&gt;.&lt;/span&gt;TabularDataset&lt;span style="color:#f92672"&gt;.&lt;/span&gt;splits(path&lt;span style="color:#f92672"&gt;=&lt;/span&gt;base_dir&lt;span style="color:#f92672"&gt;.&lt;/span&gt;as_posix(), train&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#39;train_data.csv&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; validation&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#39;val_data.csv&amp;#39;&lt;/span&gt;, test&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#39;test_data.csv&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; format&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#39;tsv&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; fields&lt;span style="color:#f92672"&gt;=&lt;/span&gt;[(&lt;span style="color:#66d9ef"&gt;None&lt;/span&gt;, &lt;span style="color:#66d9ef"&gt;None&lt;/span&gt;), (&lt;span style="color:#e6db74"&gt;&amp;#39;input&amp;#39;&lt;/span&gt;, INPUT), (&lt;span style="color:#e6db74"&gt;&amp;#39;tag&amp;#39;&lt;/span&gt;, TAG)])
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="load-single-file"&gt;Load single file&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;all_data &lt;span style="color:#f92672"&gt;=&lt;/span&gt; data&lt;span style="color:#f92672"&gt;.&lt;/span&gt;TabularDataset(path&lt;span style="color:#f92672"&gt;=&lt;/span&gt;base_dir &lt;span style="color:#f92672"&gt;/&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#39;gossip_train_data.csv&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; format&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#39;tsv&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; fields&lt;span style="color:#f92672"&gt;=&lt;/span&gt;[(&lt;span style="color:#e6db74"&gt;&amp;#39;text&amp;#39;&lt;/span&gt;, TEXT), (&lt;span style="color:#e6db74"&gt;&amp;#39;category&amp;#39;&lt;/span&gt;, CATEGORY)])
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;train, val, test &lt;span style="color:#f92672"&gt;=&lt;/span&gt; all_data&lt;span style="color:#f92672"&gt;.&lt;/span&gt;split([&lt;span style="color:#ae81ff"&gt;0.7&lt;/span&gt;, &lt;span style="color:#ae81ff"&gt;0.2&lt;/span&gt;, &lt;span style="color:#ae81ff"&gt;0.1&lt;/span&gt;])
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="create-iterator"&gt;Create iterator&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;train_iter, val_iter, test_iter &lt;span style="color:#f92672"&gt;=&lt;/span&gt; data&lt;span style="color:#f92672"&gt;.&lt;/span&gt;BucketIterator&lt;span style="color:#f92672"&gt;.&lt;/span&gt;splits(
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; (train, val, test), batch_sizes&lt;span style="color:#f92672"&gt;=&lt;/span&gt;(&lt;span style="color:#ae81ff"&gt;32&lt;/span&gt;, &lt;span style="color:#ae81ff"&gt;256&lt;/span&gt;, &lt;span style="color:#ae81ff"&gt;256&lt;/span&gt;), shuffle&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#66d9ef"&gt;True&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; sort_key&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#66d9ef"&gt;lambda&lt;/span&gt; x: len(x&lt;span style="color:#f92672"&gt;.&lt;/span&gt;input))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="load-pretrained-vector"&gt;Load pretrained vector&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;vectors &lt;span style="color:#f92672"&gt;=&lt;/span&gt; Vectors(name&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#39;cc.zh.300.vec&amp;#39;&lt;/span&gt;, cache&lt;span style="color:#f92672"&gt;=&lt;/span&gt;&lt;span style="color:#e6db74"&gt;&amp;#39;./&amp;#39;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;INPUT&lt;span style="color:#f92672"&gt;.&lt;/span&gt;build_vocab(train, vectors&lt;span style="color:#f92672"&gt;=&lt;/span&gt;vectors)
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;TAG&lt;span style="color:#f92672"&gt;.&lt;/span&gt;build_vocab(train, val, test)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="check-vocab-sizes"&gt;Check vocab sizes&lt;/h2&gt;
&lt;p&gt;You can view the vocabulary&amp;rsquo;s index-to-token list via &lt;code&gt;vocab.itos&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>Circular Import in Python</title><link>https://fromkk.com/posts/circular-import-in-python/</link><pubDate>Sun, 10 Mar 2019 10:59:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/circular-import-in-python/</guid><description>&lt;p&gt;Recently, I found a really good code example of a circular import in Python, and I&amp;rsquo;d like to record it here.&lt;/p&gt;
&lt;p&gt;Here is the code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;
&lt;table style="border-spacing:0;padding:0;margin:0;border:0;"&gt;&lt;tr&gt;
&lt;td style="vertical-align:top;padding:0;margin:0;border:0;;width:100%"&gt;
&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python3" data-lang="python3"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# X.py&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;def&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;X1&lt;/span&gt;():
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#66d9ef"&gt;return&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;x1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;from&lt;/span&gt; Y &lt;span style="color:#f92672"&gt;import&lt;/span&gt; Y2
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;def&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;X2&lt;/span&gt;():
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#66d9ef"&gt;return&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;x2&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;div class="highlight"&gt;&lt;div style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;
&lt;table style="border-spacing:0;padding:0;margin:0;border:0;"&gt;&lt;tr&gt;
&lt;td style="vertical-align:top;padding:0;margin:0;border:0;;width:100%"&gt;
&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-python3" data-lang="python3"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#75715e"&gt;# Y.py&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;def&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;Y1&lt;/span&gt;():
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#66d9ef"&gt;return&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;y1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#f92672"&gt;from&lt;/span&gt; X &lt;span style="color:#f92672"&gt;import&lt;/span&gt; X1
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;&lt;span style="color:#66d9ef"&gt;def&lt;/span&gt; &lt;span style="color:#a6e22e"&gt;Y2&lt;/span&gt;():
&lt;/span&gt;&lt;/span&gt;&lt;span style="display:flex;"&gt;&lt;span&gt; &lt;span style="color:#66d9ef"&gt;return&lt;/span&gt; &lt;span style="color:#e6db74"&gt;&amp;#34;y2&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Guess what will happen if you run &lt;code&gt;python X.py&lt;/code&gt; and &lt;code&gt;python Y.py&lt;/code&gt;?&lt;/p&gt;</description></item><item><title>Python Dictionary Implementation</title><link>https://fromkk.com/posts/python-dictionary-implementation/</link><pubDate>Sun, 17 Feb 2019 21:48:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/python-dictionary-implementation/</guid><description>&lt;h2 id="overview"&gt;Overview&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;CPython allocates memory to store the dictionary. The initial table size is 8, and each slot stores an entry as &lt;code&gt;&amp;lt;hash,key,value&amp;gt;&lt;/code&gt; (the slot layout changed in Python 3.6 and later).&lt;/li&gt;
&lt;li&gt;When a new key is added, CPython uses &lt;code&gt;i = hash(key) &amp;amp; mask&lt;/code&gt;, where &lt;code&gt;mask=table_size-1&lt;/code&gt;, to calculate the slot it should be placed in. If that slot is occupied, CPython uses a probing algorithm to find an empty slot for the new item.&lt;/li&gt;
&lt;li&gt;When the table is 2/3 full, it is resized.&lt;/li&gt;
&lt;li&gt;When looking up an item, both the &lt;code&gt;hash&lt;/code&gt; and the &lt;code&gt;key&lt;/code&gt; must match the stored entry.&lt;/li&gt;
&lt;/ol&gt;
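&lt;p&gt;The steps above can be sketched as a toy table. This is a simplified model, not CPython&amp;rsquo;s implementation: it uses plain linear probing instead of CPython&amp;rsquo;s probing sequence, it never resizes (so the table must not fill up), and the names &lt;code&gt;find_slot&lt;/code&gt;, &lt;code&gt;insert&lt;/code&gt; and &lt;code&gt;lookup&lt;/code&gt; are invented. The modulo below gives the same slot as the mask step because the table size is a power of two.&lt;/p&gt;

```python
def find_slot(table, key):
    """Return the index of the slot holding `key`, or of the first empty slot for it."""
    size = len(table)  # assumed to be a power of two, e.g. 8
    key_hash = hash(key)
    # For a power-of-two size this equals a bitwise AND of the hash with (size - 1).
    i = key_hash % size
    # Assumes the table always has at least one empty slot (no resizing here).
    while table[i] is not None:
        stored_hash, stored_key, _ = table[i]
        if stored_hash == key_hash and stored_key == key:
            break  # both hash and key match: this is the entry
        i = (i + 1) % size  # simplification: linear probing
    return i


def insert(table, key, value):
    """Store the entry as a (hash, key, value) tuple."""
    table[find_slot(table, key)] = (hash(key), key, value)


def lookup(table, key):
    entry = table[find_slot(table, key)]
    if entry is None:
        raise KeyError(key)
    return entry[2]


table = [None] * 8
insert(table, 1, "a")
insert(table, 9, "b")  # hash(9) lands on the same slot as hash(1), so it is probed onward
print(lookup(table, 1), lookup(table, 9))
```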
&lt;h2 id="resizing"&gt;Resizing&lt;/h2&gt;
&lt;p&gt;When the number of elements is below 50000, the new table size is based on 4 times the number of used slots. Otherwise, it is based on 2 times the number of used slots. The table size is always a power of two, \(2^{n}\).&lt;/p&gt;</description></item><item><title>CSRF in Django</title><link>https://fromkk.com/posts/csrf-in-django/</link><pubDate>Wed, 07 Nov 2018 13:58:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/csrf-in-django/</guid><description>&lt;p&gt;CSRF (cross-site request forgery) is a way to forge a user&amp;rsquo;s request to a target website. For example, a malicious website A has a button that, when clicked, sends a request to &lt;a href="https://www.B.com/logout" target="_blank" rel="noopener noreffer "&gt;www.B.com/logout&lt;/a&gt;. When the user clicks this button, they are logged out of website B without realizing it. Logging out is not a big problem, but a malicious website can forge more dangerous requests, such as a money transfer.&lt;/p&gt;
&lt;h2 id="django-csrf-protection"&gt;Django CSRF protection&lt;/h2&gt;
&lt;p&gt;Each web framework takes a different approach to CSRF protection. In Django, the validation process is as follows:&lt;/p&gt;</description></item></channel></rss>