Developer Blog

Tips and tricks for developers and IT enthusiasts

BeautifulSoup | Complete Cheatsheet with Examples

Installation

pip install beautifulsoup4
from bs4 import BeautifulSoup

Creating a BeautifulSoup Object

Parse HTML string:

html = "<p>Example paragraph</p>"
soup = BeautifulSoup(html, 'html.parser')

Parse from file:

with open("index.html") as file:
  soup = BeautifulSoup(file, 'html.parser')

BeautifulSoup Object Types

When parsing documents and navigating the parse trees, you will encounter the following main object types:

Tag

A Tag corresponds to an HTML or XML tag in the original document:

soup = BeautifulSoup('<p>Hello World</p>', 'html.parser')
p_tag = soup.p

p_tag.name # 'p'
p_tag.string # 'Hello World'

Tags contain nested Tags and NavigableStrings.

NavigableString

A NavigableString represents text content without tags:

soup = BeautifulSoup('Hello World', 'html.parser')
text = soup.string

text # 'Hello World'
type(text) # bs4.element.NavigableString

BeautifulSoup

The BeautifulSoup object represents the parsed document as a whole. It is the root of the tree:

soup = BeautifulSoup('<html>...</html>', 'html.parser')

soup.name # '[document]'
soup.head # <head> Tag element

Comment

Comments in HTML are also available as Comment objects:

<!-- This is a comment -->

import re

comment = soup.find(string=re.compile('This is'))
type(comment) # bs4.element.Comment

Knowing these core object types helps when analyzing, searching, and navigating parsed documents.
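To see the four object types side by side, here is a minimal sketch. The HTML snippet is made up for illustration, and the parser is assumed to be the built-in 'html.parser'.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment, NavigableString, Tag

# Illustrative document, not taken from a real page
html = (
    "<html><head><title>Demo</title></head>"
    "<body><p>Hello World</p><!-- This is a comment --></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

p_tag = soup.p                 # a Tag
text = p_tag.string            # a NavigableString
# Comment is a subclass of NavigableString, so a string filter can find it
comment = soup.find(string=lambda s: isinstance(s, Comment))

kinds = {
    "soup": type(soup).__name__,
    "p_tag": type(p_tag).__name__,
    "text": type(text).__name__,
    "comment": type(comment).__name__,
}
print(kinds)
```

Checking the type is handy when you iterate over mixed children and want to skip comments or plain strings.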

Searching the Parse Tree

By Name

HTML:

<div>
  <p>Paragraph 1</p>
  <p>Paragraph 2</p>
</div>

Python:

paragraphs = soup.find_all('p')
# <p>Paragraph 1</p>, <p>Paragraph 2</p>

By Attributes

HTML:

<div id="content">
  <p>Paragraph 1</p>
</div>

Python:

div = soup.find(id="content")
# <div id="content">...</div>

By Text

HTML:

<p>This is some text</p>

Python:

text = soup.find(string="This is some text")
# 'This is some text' (a NavigableString, not the tag)
p = text.parent
# <p>This is some text</p>

Searching with CSS Selectors

CSS selectors provide a very powerful way to search for elements within a parsed document.

Some examples of CSS selector syntax:

By Tag Name

Select all <p> tags:

soup.select("p")

By ID

Select element with ID “main”:

soup.select("#main")

By Class Name

Select elements with class “article”:

soup.select(".article")

By Attribute

Select tags with a “data-category” attribute:

soup.select("[data-category]")

Descendant Combinator

Select paragraphs inside divs:

soup.select("div p")

Child Combinator

Select direct children paragraphs:

soup.select("div > p")

Adjacent Sibling

Select h2 after h1:

soup.select("h1 + h2")

General Sibling

Select h2 after any h1:

soup.select("h1 ~ h2")

By Text

Select elements containing text (:-soup-contains replaces the deprecated :contains):

soup.select(":-soup-contains('Some text')")

By Attribute Value

Select input with type submit:

soup.select("input[type='submit']")

Pseudo-classes

Select first paragraph:

soup.select("p:first-of-type")

Chaining

Select first article paragraph:

soup.select("article > p:nth-of-type(1)")
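The selectors above are easier to compare when run against one small document. This is a sketch with made-up HTML; the helper texts() is hypothetical and only exists to keep the example short.

```python
from bs4 import BeautifulSoup

# Illustrative markup for trying out selectors
html = """
<div id="main">
  <h1>Title</h1>
  <h2>Subtitle</h2>
  <p class="article" data-category="news">First</p>
  <p>Second</p>
  <section><p>Nested</p></section>
</div>
"""
soup = BeautifulSoup(html, "html.parser")


def texts(selector):
    """Return the text of every element matching a CSS selector."""
    return [el.get_text() for el in soup.select(selector)]


by_class = texts(".article")          # class selector
by_attr = texts("[data-category]")    # attribute presence
descendants = texts("div p")          # any depth below a div
children = texts("div > p")           # direct children only
adjacent = texts("h1 + h2")           # h2 directly after h1
first_match = soup.select_one("#main > p")  # first match or None

print(descendants, children)
```

Note the difference between "div p" and "div > p": only the descendant combinator also finds the paragraph nested inside <section>.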

Accessing Data

HTML:

<p class="content">Some text</p>

Python:

p = soup.find('p')
p.name # "p"
p.attrs # {"class": ["content"]} (class is a multi-valued attribute, so it is a list)
p.string # "Some text"

The Power of find_all()

The find_all() method is one of the most useful and versatile searching methods in BeautifulSoup.

Returns All Matches

find_all() will find and return a list of all matching elements:

all_paras = soup.find_all('p')

This gives you all paragraphs on a page.

Flexible Queries

You can pass a wide range of queries to find_all():

  • Name – find_all('p')
  • Attributes – find_all('a', class_='external')
  • Text – find_all(string=re.compile('summary'))
  • Limit – find_all('p', limit=2)

And more!

Useful Features

Some useful things you can do with find_all():

  • Get a count – len(soup.find_all('p'))
  • Iterate through results – for p in soup.find_all('p'):
  • Convert to text – [p.get_text() for p in soup.find_all('p')]
  • Extract attributes – [a['href'] for a in soup.find_all('a')]

Why It’s Useful

In summary, find_all() is useful because:

  • It returns all matching elements
  • It supports diverse and powerful queries
  • It enables easily extracting and processing result data

Whenever you need to get a collection of elements from a parsed document, find_all() will likely be your go-to tool.
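Here is a short sketch that combines several of these queries. The HTML and the URLs in it are placeholders.

```python
import re

from bs4 import BeautifulSoup

# Illustrative markup; the links are placeholders
html = """
<p class="summary">Short summary</p>
<p>Details</p>
<a class="external" href="https://example.com">Example</a>
<a href="/local">Local</a>
"""
soup = BeautifulSoup(html, "html.parser")

count = len(soup.find_all("p"))                                   # count matches
external = [a["href"] for a in soup.find_all("a", class_="external")]
limited = soup.find_all("p", limit=1)                             # stop after one hit
matching = soup.find_all(string=re.compile("summary"))            # strings, not tags
all_links = [a.get("href") for a in soup.find_all("a")]

print(count, external, all_links)
```

Using a.get("href") instead of a["href"] avoids a KeyError for anchors without that attribute.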

Navigating Trees

Traverse up and sideways through related elements.
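A minimal sketch of moving up and sideways from one element. The markup is made up and deliberately contains no whitespace between tags, so that siblings and children are tags only.

```python
from bs4 import BeautifulSoup

# No whitespace between tags, so there are no whitespace-only text nodes
html = "<div><h1>Title</h1><p>First</p><p>Second</p></div>"
soup = BeautifulSoup(html, "html.parser")

first_p = soup.find("p")

parent_name = first_p.parent.name                         # one level up
next_text = first_p.find_next_sibling("p").get_text()     # sideways, forward
prev_name = first_p.find_previous_sibling().name          # sideways, backward
child_names = [child.name for child in soup.div.children]  # one level down
ancestor_names = [a.name for a in first_p.parents]        # all the way to the root

print(parent_name, next_text, prev_name, child_names, ancestor_names)
```

In real pages, .next_sibling and .children often yield whitespace strings between tags; find_next_sibling() and find_previous_sibling() skip those.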

Modifying the Parse Tree

BeautifulSoup provides several methods for editing and modifying the parsed document tree.

HTML:

<p>Original text</p>

Python:

p = soup.find('p')
p.string = "New text"

Edit Tag Names

Change an existing tag name:

tag = soup.find('span')
tag.name = 'div'

Edit Attributes

Add, modify or delete attributes of a tag:

tag['class'] = 'header' # set attribute
tag['id'] = 'main'

del tag['class'] # delete attribute

Edit Text

Change text of a tag:

tag.string = "New text"

Append text to a tag:

tag.append("Additional text")

Insert Tags

Insert a new tag:

new_tag = soup.new_tag("h1")
tag.insert_before(new_tag)

Delete Tags

Remove a tag entirely:

tag.extract()

Wrap/Unwrap Tags

Wrap another tag around:

tag.wrap(soup.new_tag('div'))

Unwrap its contents:

tag.unwrap()

Modifying the parse tree is very useful for cleaning up scraped data or extracting the parts you need.
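The individual edits above can be chained on one tag. This sketch uses a one-line illustrative document.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<span class="old">Original text</span>', "html.parser")
tag = soup.find("span")

tag.name = "p"                         # rename <span> to <p>
tag["id"] = "main"                     # add an attribute
del tag["class"]                       # remove an attribute
tag.string = "New text"                # replace the text
tag.append(" and more")                # append more text
tag.insert_before(soup.new_tag("h1"))  # insert an empty <h1> in front
tag.wrap(soup.new_tag("div"))          # wrap the <p> in a <div>

result = str(soup)
print(result)
```

All changes happen in place on the tree, so str(soup) or soup.prettify() reflects them immediately.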

Outputting HTML

Input HTML:

<p>Hello World</p>

Python:

print(soup.prettify())

# <p>
#  Hello World
# </p>

Integrating with Requests

Fetch a page:

import requests

res = requests.get("https://example.com")
soup = BeautifulSoup(res.text, 'html.parser')

Parsing Only Parts of a Document

When dealing with large documents, you may want to parse only a fragment rather than the whole thing. BeautifulSoup allows for this using SoupStrainers.

There are a few ways to parse only parts of a document:

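Basic Usage

A SoupStrainer takes the same arguments as find_all() (a tag name, attributes, a string, or a function), not a CSS selector. Pass it to the constructor as parse_only:

from bs4 import SoupStrainer

only_tables = SoupStrainer("table")
soup = BeautifulSoup(doc, 'html.parser', parse_only=only_tables)

This will parse only the <table> tags from the document. Note that parse_only has no effect with the html5lib parser.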

By Tag Name

Parse only specific tags:

only_divs = SoupStrainer("div")
soup = BeautifulSoup(doc, 'html.parser', parse_only=only_divs)

By Function

Pass a function to test if a tag should be parsed:

def is_short_string(string):
  return len(string) < 20

only_short_strings = SoupStrainer(string=is_short_string)
soup = BeautifulSoup(doc, 'html.parser', parse_only=only_short_strings)

This keeps only the strings for which the function returns True.

By Attributes

Parse tags that contain specific attributes:

has_data_attr = SoupStrainer(attrs={"data-category": True})
soup = BeautifulSoup(doc, 'html.parser', parse_only=has_data_attr)

Multiple Conditions

You can combine multiple conditions in one strainer:

strainer = SoupStrainer("div", id="main")
soup = BeautifulSoup(doc, 'html.parser', parse_only=strainer)

This will parse only <div id="main">.

Parsing only parts you need can help reduce memory usage and improve performance when scraping large documents.
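To check what a strainer actually keeps, a small sketch with a made-up document helps. It assumes the built-in 'html.parser'.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Illustrative document with one wanted and two unwanted parts
doc = """
<html><body>
  <div id="main"><p>Keep me</p></div>
  <div id="sidebar"><p>Skip me</p></div>
  <table><tr><td>cell</td></tr></table>
</body></html>
"""

only_main = SoupStrainer("div", id="main")
soup = BeautifulSoup(doc, "html.parser", parse_only=only_main)

kept_text = soup.get_text(strip=True)          # text of what survived
has_table = soup.find("table") is not None     # was the table parsed?
has_sidebar = soup.find(id="sidebar") is not None

print(kept_text, has_table, has_sidebar)
```

Everything outside the matching <div> is discarded while parsing, so later searches on soup cannot find it.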

Dealing with Encoding

When parsing documents, you may encounter encoding issues. Here are some ways to handle encoding:

Specify at Parse Time

Pass the from_encoding parameter when creating the BeautifulSoup object:

soup = BeautifulSoup(doc, 'html.parser', from_encoding='utf-8')

This handles any decoding needed when initially parsing the document.

Encode Tag Contents

You can encode the contents of a tag:

tag.string.encode("utf-8")

Use this when outputting tag strings.

Encode Entire Document

To encode the entire BeautifulSoup document:

soup.encode("utf-8")

This returns a byte string with the encoded document.

Pretty Print with Encoding

Specify the encoding when pretty printing (prettify() then returns bytes instead of str):

print(soup.prettify(encoding="utf-8"))

Unicode Dammit

BeautifulSoup’s UnicodeDammit class can detect and convert incoming documents to Unicode:

from bs4 import UnicodeDammit

dammit = UnicodeDammit(doc)
markup = dammit.unicode_markup # decoded text (str), not a BeautifulSoup object

This converts even poorly encoded documents to Unicode.

Properly handling encoding ensures your scraped data is decoded and output correctly when using BeautifulSoup.
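A round trip makes the encoding options concrete. In this sketch the input bytes are made up and deliberately encoded as Latin-1; the encoding is passed explicitly, which is an assumption about what you know of your data.

```python
from bs4 import BeautifulSoup, UnicodeDammit

# Illustrative input: bytes in Latin-1, as they might come from an old web page
raw = "<p>Grüße aus Köln</p>".encode("latin-1")

# Tell BeautifulSoup which encoding the bytes use
soup = BeautifulSoup(raw, "html.parser", from_encoding="latin-1")
text = soup.p.string

# UnicodeDammit alone: decode bytes to str, trying the given encodings first
dammit = UnicodeDammit(raw, ["latin-1"])
markup = dammit.unicode_markup

# Re-encode the whole document as UTF-8 bytes
utf8_bytes = soup.encode("utf-8")

print(text, type(utf8_bytes).__name__)
```

from_encoding only matters when the input is bytes; a str passed to BeautifulSoup is already decoded.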

Django | Debugging Django-App in VS Code

See here how to configure VS Code:

  • Switch to Run view in VS Code (using the left-side activity bar or F5). You may see the message
    “To customize Run and Debug create a launch.json file”.
    This means that you don’t yet have a launch.json file containing debug configurations. VS Code can create that for you if you click on the create a launch.json file link.
  • Select the link and VS Code will prompt for a debug configuration. Select Django from the dropdown and VS Code will populate a new launch.json file with a Django run configuration.
    The launch.json file contains a number of debugging configurations, each of which is a separate JSON object within the configuration array.
  • Scroll down to and examine the configuration with the name “Python: Django”:
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: Django",
      "type": "python",
      "request": "launch",
      "program": "${workspaceFolder}\\manage.py",
      "args": ["runserver"],
      "django": true,
      "justMyCode": true
    }
  ]
}
  • This configuration tells VS Code to run "${workspaceFolder}/manage.py" using the selected Python interpreter and the arguments in the args list.
    Launching the VS Code debugger with this configuration, then, is the same as running python manage.py runserver in the VS Code Terminal with your activated virtual environment. (You can add a port number like "5000" to args if desired.)
    The "django": true entry also tells VS Code to enable debugging of Django page templates, which you see later in this tutorial.
  • Test the configuration by selecting the Run > Start Debugging menu command, or selecting the green Start Debugging arrow next to the list (F5).
  • Ctrl+click the http://127.0.0.1:8000/ URL in the terminal output window to open the browser and see that the app is running properly.
  • Close the browser and stop the debugger when you’re finished. To stop the debugger, use the Stop toolbar button (the red square) or the Run > Stop Debugging command (Shift+F5).
  • You can now use the Run > Start Debugging at any time to test the app, which also has the benefit of automatically saving all modified files.

Python | Cookbook

Pip

List all available versions of a package

pip install --use-deprecated=legacy-resolver <module>==
wget -q https://pypi.org/pypi/PyJWT/json -O - | python -m json.tool -
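The JSON returned by the PyPI endpoint lists every release under the "releases" key. Below is a rough sketch of pulling the version numbers out of such a payload. The sample payload is made up as a stand-in for the downloaded JSON, and list_versions() and version_key() are hypothetical helpers with a deliberately naive sort.

```python
import json


def version_key(version):
    """Naive sort key: numeric dot-separated parts; anything else counts as 0."""
    return tuple(int(part) if part.isdigit() else 0 for part in version.split("."))


def list_versions(payload):
    """Return the version strings found under 'releases', oldest first."""
    data = json.loads(payload)
    return sorted(data["releases"], key=version_key)


# Made-up stand-in for the JSON document downloaded with wget
sample_payload = json.dumps({
    "info": {"name": "example-package"},
    "releases": {"1.10.0": [], "1.2.0": [], "2.0.0": []},
})

versions = list_versions(sample_payload)
print(versions)
```

For real packages with pre-releases or post-releases, a proper version parser is the safer choice; the key function here only handles plain numeric versions.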

Show Pip Configuration

pip config list

Set Pip Cache Folder

pip config set global.cache-dir D:\Temp\Pip\Cache

Location of packages

pip show <module>

Installation

Update all Python Packages with Powershell

pip freeze |



Update Packages

Update requirements.txt and turn every pinned version into a minimum version

sed -i '' 's/==/>=/g' requirements.txt
pip install -U -r requirements.txt
pip freeze > requirements.txt
pip install --upgrade --force-reinstall -r requirements.txt
pip install --ignore-installed -r requirements.txt

FastAPI | Working with FastAPI

Installation

FastAPI is based on the following powerful packages:

pip install fastapi

Or install FastAPI with all components:

pip install "fastapi[all]"
pip install "uvicorn[standard]"

Working with Databases

Alembic

Alembic is a lightweight database migration tool for use with the SQLAlchemy Database Toolkit for Python.

pip install alembic
alembic init alembic
alembic list_templates
alembic init --template generic ./scripts

Create a migration script for the ‘account’ table

alembic revision -m "create account table"

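Edit the migration script

# these imports are already part of the generated script
from alembic import op
import sqlalchemy as sa

def upgrade():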
    op.create_table(
        'account',
        sa.Column('id', sa.Integer, primary_key=True),
        sa.Column('name', sa.String(50), nullable=False),
        sa.Column('description', sa.Unicode(200)),
    )

def downgrade():
    op.drop_table('account')

Run the migration

alembic upgrade head

Create a migration script for adding a column

alembic revision -m "Add a column"

Edit the migration script

def upgrade():
    op.add_column('account', sa.Column('last_transaction_date', sa.DateTime))

def downgrade():
    op.drop_column('account', 'last_transaction_date')

Run the migration

alembic upgrade head

SAS | Migrate from SAS to Python

Introduction

Cookbook

proc freq

proc freq data=mydata;
    tables myvar / nocol nopercent nocum;
run;
mydata.myvar.value_counts().sort_index()

sort by frequency

proc freq order=freq data=mydata;
	tables myvar / nocol nopercent nocum;
run;
mydata.myvar.value_counts()

with missing

proc freq order=freq data=mydata;
    tables myvar / nocol nopercent nocum missing;
run;
mydata.myvar.value_counts(dropna=False)

proc means

proc means data=mydata n mean std min max p25 median p75;
    var myvar;
run;
mydata.myvar.describe()

more percentiles

proc means data=mydata n mean std min max p1 p5 p10 p25 median p75 p90 p95 p99;
	var myvar;
run;
mydata.myvar.describe(percentiles=[.01, .05, .1, .25, .5, .75, .9, .95, .99])
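To try the pandas side of these recipes without a SAS dataset, here is a small sketch. The DataFrame is made-up sample data; only the column name myvar follows the snippets above.

```python
import pandas as pd

# Made-up sample data with one missing value
mydata = pd.DataFrame({"myvar": [3, 1, 3, None, 2, 3]})

# proc freq: counts per value, sorted by value
freq = mydata.myvar.value_counts().sort_index()

# proc freq ... missing: include the missing value as its own row
freq_with_missing = mydata.myvar.value_counts(dropna=False)

# proc means with selected percentiles
stats = mydata.myvar.describe(percentiles=[.05, .5, .95])

print(freq)
print(freq_with_missing)
print(stats)
```

Note that describe() counts only non-missing values, which matches the N reported by proc means.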

data step

concatenate datasets

data concatenated;
    set mydata1 mydata2;
run;
concatenated = pandas.concat([mydata1, mydata2])

proc contents

proc contents data=mydata;
run;
mydata.info()

save output

proc contents noprint data=mydata out=contents;
run;
contents = mydata.dtypes  # closest simple equivalent; mydata.info() only prints and returns None

Misc

number of rows in a datastep

* Try this for size: http://www2.sas.com/proceedings/sugi26/p095-26.pdf;
len(mydata)

Django | Cookbook

Installation

Install current Version (3.2.8)

❯ pip install django==3.2.8

Install next Version (4.0)

❯ pip install --pre django

Check installed version

❯ python -m django --version
❯ django-admin.exe version

First steps

The following steps are based on a summary of the Django Tutorial

Create project

django-admin startproject main
cd main
python manage.py migrate
python manage.py runserver 8080
python manage.py startapp app_base

Create view

Create view in app_base/views.py

from django.http import HttpResponse

def index(request):
    return HttpResponse("Hello, world. You're at the polls index.")

Add view to app_base/urls.py

from django.urls import path
from . import views

urlpatterns = [
    path('', views.index, name='index'),
]

Add urls to project main/urls.py

from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('admin/', admin.site.urls),
    path('app_base/', include('app_base.urls')),
]

Create admin user

$ python manage.py createsuperuser
Username (leave blank to use 'user'): admin
Email address: admin@localhost
Password: 
Password (again): 
Superuser created successfully.

Create data and database

Create database model in app_base/models.py

from django.db import models

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')

class Choice(models.Model):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

Activating models in main/settings.py

INSTALLED_APPS = [
    'app_base.apps.AppBaseConfig',

    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]
$ python manage.py makemigrations app_base
$ python manage.py sqlmigrate app_base 0001

Make app modifiable in the admin (app_base/admin.py)

from django.contrib import admin
from .models import Question

admin.site.register(Question)

Writing more views

Create views in app_base/views.py

def detail(request, question_id):
    return HttpResponse("You're looking at question %s." % question_id)

def results(request, question_id):
    response = "You're looking at the results of question %s."
    return HttpResponse(response % question_id)

def vote(request, question_id):
    return HttpResponse("You're voting on question %s." % question_id)

Add new views into app_base/urls.py

from django.urls import path
from . import views

urlpatterns = [
    path('', views.index, name='index'),

    path('<int:question_id>/', views.detail, name='detail'),
    path('<int:question_id>/results/', views.results, name='results'),
    path('<int:question_id>/vote/', views.vote, name='vote'),
]

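Add template in app_base/templates/polls/index.html

{% if latest_question_list %}
    <ul>
    {% for question in latest_question_list %}
        <li><a href="/polls/{{ question.id }}/">{{ question.question_text }}</a></li>
    {% endfor %}
    </ul>
{% else %}
    <p>No polls are available.</p>
{% endif %}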




Modify view in app_base/views.py

from django.shortcuts import render
...
def index(request):
    latest_question_list = Question.objects.order_by('-pub_date')[:5]
    context = {'latest_question_list': latest_question_list}
    return render(request, 'polls/index.html', context)

Raising a 404 error in app_base/views.py

from django.http import HttpResponse
from django.shortcuts import render, get_object_or_404

from .models import Question
# ...
def detail(request, question_id):
    question = get_object_or_404(Question, pk=question_id)
    return render(request, 'polls/detail.html', {'question': question})

Create template app_base/templates/polls/detail.html

<h1>{{ question.question_text }}</h1>
<ul>
{% for choice in question.choice_set.all %}
    <li>{{ choice.choice_text }}</li>
{% endfor %}
</ul>

Removing hardcoded URLs in app_base/templates/polls/index.html

<li>
   <a href="{% url 'detail' question.id %}">{{ question.question_text }}</a>
</li>

This works by looking up the URL definition as specified in app_base/urls.py

...
# the 'name' value as called by the {% url %} template tag
path('<int:question_id>/', views.detail, name='detail'),
...

Namespacing URL names in app_base/urls.py

app_name = 'app_base'

urlpatterns = [
...

Then, modify link in app_base/templates/polls/index.html

from url ‘detail’ to url ‘app_base:detail’

<li>
    <a href="{% url 'app_base:detail' question.id %}">{{ question.question_text }}</a>
</li>

Use generic views: Less code is better

Create class in app_views/views.py

from django.views import generic

class HomeView(generic.TemplateView):
    template_name = 'index.html'

Create template app_views/templates/index.html

<h1>App Views:</h1>
Welcome

Modify app_views/urls.py

urlpatterns = [
    path('', views.HomeView.as_view(), name='home'),
]

Add another app to main project

Create app

$ python manage.py startapp app_views

Modify main/urls.py

urlpatterns = [
    path('admin/',     admin.site.urls),
    path('app_base/',  include('app_base.urls')),
    path('app_views/', include('app_views.urls')),
]

Add data model in app_views/models.py

from django.db import models

class DataItem(models.Model):
    text = models.CharField(max_length=200)
    data = models.IntegerField(default=0)

    def __str__(self):
        return self.text

Register data in app_views/admin.py

from django.contrib import admin
from .models import DataItem

admin.site.register(DataItem)

Activate models

$ python manage.py makemigrations app_views
$ python manage.py sqlmigrate app_views 0001
$ python manage.py migrate app_views

Navigation / Redirection

Set root page of Django project

When accessing your Django project, the root page normally doesn’t show your app’s homepage.

To change this, you have to modify the URL handling.

In the following sample, replace <appname> with the name of your app.

Define a redirection view in your app (/<appname>/views.py)

from django.shortcuts import redirect

def redirect_to_home(request):
    return redirect('/<appname>')

Define path in the global urls.py (/main/urls.py)

from django.contrib import admin
from django.urls import include, path
from django.shortcuts import redirect

from <appname> import views

urlpatterns = [
    path('',            views.redirect_to_home, name='home'),
    path('<appname>/',  include('<appname>.urls')),
    path('admin/',      admin.site.urls)
]

Highlight current page in navigation menu

<div class="list-group">
    <a href="
            Basic Upload
    </a>
    <a href="
            Progress Bar Upload
    </a>
</div>

Using PostgreSQL Database

Install PostgreSQL

Create Superuser

createuser.exe --interactive --pwprompt

Logging

Additional reading

Tutorials

Testing

Blogs and Posts

Resolving problems

Wrong template is used

The template system is using a search approach to find the specified template file, e.g. ‘home.html’.

If you created more than one app with the same template filenames, the first one found will be used.

To avoid this, add a subfolder named after the app to each template folder, e.g.

templates/
        app_base/
                home.html

Resolving error messages and errors

‘app_name’ is not a registered namespace

One reason for this error is the usage of a namespace in a link.

Back to <a href="



If you want to use this way of links, you have to define the namespace/appname in your <app>/urls.py file

app_name = 'app_views'
urlpatterns = [
    path('', views.HomeView.as_view(), name='home'),
]

dependencies reference nonexistent parent node

  • Recreate database and migration files
  • Remove all migration files under */migrations/00*.py
  • Remove all pycache folders under */__pycache__ and */*/__pycache__
  • Run migration again
$ python manage.py makemigrations
$ python manage.py migrate

ValueError: Dependency on app with no migrations: customuser

$ python manage.py makemigrations

Project Structure

Running tasks with Makefile

PREFIX_PKG := app

default:
	grep -E ':\s+#' Makefile

clearcache:	# Clear Cache
	python3 manage.py clearcache

run:		# Run Server
	python3 manage.py runserver 8000

deploy:		# Deploy
	rm -rf dist $(PREFIX_PKG)*
	rm -rf polls.dist
	cd polls && python3 setup.py sdist
	mkdir polls.dist && mv polls/dist/* polls/$(PREFIX_PKG)* polls.dist

install_bootstrap:	# Install Bootstrap Library
	cd .. && yarn add bootstrap
	rm -rf  polls/static/bootstrap
	mkdir   polls/static/bootstrap
	cp -R ../node_modules/bootstrap/dist/* polls/static/bootstrap

install_jquery:		# Install jQuery Library
	cd .. && yarn add jquery
	rm -rf polls/static/jquery
	mkdir  polls/static/jquery
	cp ../node_modules/jquery/dist/* polls/static/jquery

install_bootstrap_from_source:	# Install Bootstrap from Source
	mkdir -p install && \
	wget https://github.com/twbs/bootstrap/releases/download/v4.1.3/bootstrap-4.1.3-dist.zip -O install/bootstrap-4.1.3-dist.zip && \
	unzip install/bootstrap-4.1.3-dist.zip -d polls/static/bootstrap/4.1.3

Python | Working with Azure

First Step: Hello World Sample

The following steps are borrowed from the quick start tutorial.

Download sample repository

$ git clone https://github.com/Azure-Samples/python-docs-hello-world
$ cd python-docs-hello-world

Create virtual environment

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ export FLASK_APP=application.py
$ flask run

Log in to Azure

$ az login

Deploy to App Service

$ az webapp up --sku F1 -n azure-toolbox-flask-demo -l westeurope
webapp azure-toolbox-flask-demo doesn't exist
Creating Resource group 'xx_xx_Linux_westeurope' ...
Resource group creation complete
Creating AppServicePlan 'xx_asp_Linux_westeurope_0' ...
Creating webapp 'flask-demo' ...
Configuring default logging for the app, if not already enabled
Creating zip with contents of dir .../Working-with_Python ...
Getting scm site credentials for zip deployment
Starting zip deployment. This operation can take a while to complete ...
Deployment endpoint responded with status code 202
You can launch the app at http://via-internet-flask-demo.azurewebsites.net
{
  "URL": "http:/azure-toolbox-flask-demo.azurewebsites.net",
  "appserviceplan": "xx_asp_Linux_westeurope_0",
  "location": "westeurope",
  "name": "azure-toolbox--flask-demo",
  "os": "Linux",
  "resourcegroup": "xx_xx_Linux_westeurope",
  "runtime_version": "python|3.7",
  "runtime_version_detected": "-",
  "sku": "FREE",
  "src_path": ".../Working-with_Python"
}

Create Django App with PostgreSQL

Install PostgreSQL on macOS

$ brew install postgres
==> Installing dependencies for postgresql: krb5
==> Installing postgresql dependency: krb5
...
==> Installing postgresql
...
==> Caveats
==> krb5
krb5 is keg-only, which means it was not symlinked into /usr/local, because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.

If you need to have krb5 first in your PATH run:
  echo 'export PATH="/usr/local/opt/krb5/bin:$PATH"' >> ~/.bash_profile
  echo 'export PATH="/usr/local/opt/krb5/sbin:$PATH"' >> ~/.bash_profile

For compilers to find krb5 you may need to set:
  export LDFLAGS="-L/usr/local/opt/krb5/lib"
  export CPPFLAGS="-I/usr/local/opt/krb5/include"

For pkg-config to find krb5 you may need to set:
  export PKG_CONFIG_PATH="/usr/local/opt/krb5/lib/pkgconfig"

==> postgresql
To migrate existing data from a previous major version of PostgreSQL run:
  brew postgresql-upgrade-database

To have launchd start postgresql now and restart at login:
  brew services start postgresql
Or, if you don't want/need a background service you can just run:
  pg_ctl -D /usr/local/var/postgres start

Set user and passwords for postgres database

Create database and user for django app

$ psql postgres
psql (12.1)
Type "help" for help.

postgres=# CREATE DATABASE pollsdb;
CREATE DATABASE
postgres=# CREATE USER manager WITH PASSWORD '########';
CREATE ROLE
postgres=# GRANT ALL PRIVILEGES ON DATABASE pollsdb TO manager;
GRANT

Download sample repository

$ git clone https://github.com/Azure-Samples/djangoapp.git
$ cd djangoapp

Create virtual environment

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ cat env.sh
export DBHOST="localhost"
export DBUSER="manager"
export DBNAME="pollsdb"
export DBPASS="supersecretpass"
$ . env.sh
$ python manage.py  makemigrations
No changes detected
$ python manage.py  migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, polls, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying admin.0003_logentry_add_action_flag_choices... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying auth.0009_alter_user_last_name_max_length... OK
  Applying polls.0001_initial... OK
  Applying sessions.0001_initial... OK
 $ python manage.py createsuperuser
Username (leave blank to use 'user'): admin
Email address: admin@localhost
Password:
Password (again):
Superuser created successfully.

Run server and access web page at http://127.0.0.1:8000/

$ python manage.py runserver
Performing system checks...

System check identified no issues (0 silenced).
January 25, 2020 - 16:42:14
Django version 2.1.2, using settings 'azuresite.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
[25/Jan/2020 16:42:26] "GET / HTTP/1.1" 200 111
[25/Jan/2020 16:42:26] "GET /static/polls/style.css HTTP/1.1" 200 27
Not Found: /favicon.ico
[25/Jan/2020 16:42:26] "GET /favicon.ico HTTP/1.1" 404 2688

Log in to Azure

$ az login

Deploy to App Service

$ az webapp up --sku F1 -n azure-toolbox-django-demo -l westeurope
webapp azure-toolbox-django-demo doesn't exist
Creating Resource group 'xx_xx_Linux_westeurope' ...
Resource group creation complete
Creating AppServicePlan 'xx_asp_Linux_westeurope_0' ...
Creating webapp 'flask-demo' ...
Configuring default logging for the app, if not already enabled
Creating zip with contents of dir .../Working-with_Django ...
Getting scm site credentials for zip deployment
Starting zip deployment. This operation can take a while to complete ...
Deployment endpoint responded with status code 202
You can launch the app at http://via-internet-django-demo.azurewebsites.net
{
  "URL": "http:/azure-toolbox-django-demo.azurewebsites.net",
  "appserviceplan": "xx_asp_Linux_westeurope_0",
  "location": "westeurope",
  "name": "azure-toolbox--django-demo",
  "os": "Linux",
  "resourcegroup": "xx_xx_Linux_westeurope",
  "runtime_version": "python|3.7",
  "runtime_version_detected": "-",
  "sku": "FREE",
  "src_path": ".../Working-with_Django"
}

Additional Reading

Installation

Here is the documentation from Microsoft.

Mac OS

Install with Homebrew

$ brew update && brew install azure-cli
$ az login

Python | Test-Driven Development

  • Part 1: Create a TDD Python Project
  • Part 2: Use Jenkins to automatically test your App

Part 1: Create a TDD Python Project

Final source code is on Github.

Introduction

Creating an error-free program is not easy. And once your program runs without errors, keeping it error-free after an update or change is even harder: you don’t want to introduce new errors or replace correct code with faulty code.

The answer to this situation (directly from the Oracle of Delphi) is: Testing, Testing, Testing

And the best way to test is to start with tests.

This means: think about what the result should be and then create a test that checks it. Imagine you have to write a function for adding two values, and you are asked to describe its functionality.

So, maybe, your description contains one or two examples:

My function adds two numbers, e.g. 5 plus 7 is 12 (or at least should be 12 :))

The TDD procedure is:

  • think about and define what the function should do
  • write a stub for the function, e.g. only function parameters and return type
  • write a function that tests your function with defined parameters and a known result

For our example above, this means:

Write the python script with the desired functionality: src/main.py

def add(val1,val2):
    return 0 # this is only a dummy return value

Write the Python test script: tst/test_add.py

from main import add  # add() lives in src/main.py

def test_add():
    result = add(5, 7)

    if result == 12:
        print("everything fine")
    else:
        print("ups, problems with base arithmetics")

test_add()

Now, with these in your toolbox, you can always verify your code by running the tests.

$ PYTHONPATH=./src python tst/test_add.py
ups, problems with base arithmetics
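The failing run above is the expected first step: the test fails against the stub and passes once the function is implemented. The sketch below shows that cycle in a single file using unittest from the standard library. The names add_stub, make_test and run are hypothetical helpers for this illustration only.

```python
import os
import unittest


def add_stub(val1, val2):
    """Step 1: the stub with the right signature and a dummy result."""
    return 0


def add(val1, val2):
    """Step 2: the real implementation, written after the test exists."""
    return val1 + val2


def make_test(func):
    """Build a test case that checks the given function."""
    class AddTest(unittest.TestCase):
        def test_add(self):
            self.assertEqual(12, func(5, 7))
    return AddTest


def run(func):
    """Run the test case against func and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_test(func))
    with open(os.devnull, "w") as sink:  # keep the runner output quiet
        return unittest.TextTestRunner(stream=sink).run(suite)


red = run(add_stub)   # fails: the stub returns 0
green = run(add)      # passes: 5 + 7 == 12

print("stub passes:", red.wasSuccessful())
print("implementation passes:", green.wasSuccessful())
```

In a real project the stub and the implementation are the same function at different points in time; they are separate here only so both runs fit in one script.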

Setup virtual environment

Tests are usually repeated after every change. To make sure that each test run behaves the same way in the same environment, we use Python’s virtual environment feature to create a fresh Python environment for the tests.

Create virtual environment

$ python3 -m venv .env/python

Activate environment

Add the following line to .bashrc (or .envrc if you are using direnv)

$ . .env/python/bin/activate

Install required packages

$ pip install pytest

Create a sample Application

Prepare folder

Create folder for sources

$ mkdir src

Create sample package

$ mkdir src/CalculatorLib
$ touch src/CalculatorLib/__init__.py
$ touch src/CalculatorLib/Calculator.py

Finally, create a simple Calculator: src/CalculatorLib/Calculator.py

class Calculator:
    def __init__(self):
        print("Init Calculator")

    def add(self, a, b):
        return a + b

    def subtract(self, a, b):
        return a - b

    def multiply(self, a, b):
        return a * b

    def divide(self, a, b):
        return a / b

    def power(self, base, exp):
        return base ** exp

Create the Main App for your Calculator: src/main.py

from CalculatorLib.Calculator import Calculator

class Main(object):

    def run(self):
        c = Calculator()

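        print("5 + 3 = %5d" % c.add(5, 3))
        print("8 - 4 = %5d" % c.subtract(8, 4))
        print("5 * 3 = %5d" % c.multiply(5, 3))
        print("8 / 4 = %5d" % c.divide(8, 4))

        print("8 ^ 4 = %5d" % c.power(8, 4))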

if __name__ == '__main__':
    Main().run()

You’re done with the first development step. Try your app:

$ python src/main.py
Init Calculator
5 + 3 =     8
8 - 4 =     4
5 * 3 =    15
8 / 4 =     2
8 ^ 4 =  4096

Add Unit Tests

We will start with our first test. Create folder for tests and a file tst/main.py

$ mkdir tst
$ touch tst/main.py

Use the following for your test script tst/main.py

from CalculatorLib.Calculator import Calculator
import unittest

class CalculatorTest(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        cls.c = Calculator()

    def test_add(self):
        self.assertEqual(8, self.c.add(5, 3))

    def test_subtract(self):
        self.assertEqual(4, self.c.subtract(8, 4))

    def test_multiply(self):
        self.assertEqual(32, self.c.multiply(8, 4))

    def test_divide(self):
        self.assertEqual(2, self.c.divide(8, 4))
            
    def test_power(self):
        self.assertEqual(16, self.c.power(2, 4))
                                    
if __name__ == '__main__':
    unittest.main()

Finally try your test script:

$ PYTHONPATH=./src python -m pytest tst/main.py  --verbose
================================= test session starts ================================
platform darwin -- Python 3.7.4, pytest-4.4.1, py-1.8.0, pluggy-0.9.0 -- <Testproject_Python-Calculator/.env/python/bin/python>
cachedir: .pytest_cache
rootdir: <Testproject_Python-Calculator>
plugins: cov-2.6.1
collected 5 items

tst/main.py::CalculatorTest::test_add PASSED             [ 20%]
tst/main.py::CalculatorTest::test_divide PASSED          [ 40%]
tst/main.py::CalculatorTest::test_multiply PASSED        [ 60%]
tst/main.py::CalculatorTest::test_power PASSED           [ 80%]
tst/main.py::CalculatorTest::test_subtract PASSED        [100%]

The command to run the tests is python -m pytest tst/main.py, so why the leading PYTHONPATH variable?

Try it without:

$ python -m pytest tst/main.py
=================================== test session starts ==================================
platform darwin -- Python 3.7.4, pytest-4.4.1, py-1.8.0, pluggy-0.9.0 -- ##/Testproject_Python-Calculator/.env/python/bin/python
cachedir: .pytest_cache
rootdir: ##/Testproject_Python-Calculator
plugins: cov-2.6.1
collected 0 items / 1 errors

========================================= ERRORS =========================================
____________________________________ ERROR collecting tst/main.py ________________________
ImportError while importing test module '##/Testproject_Python-Calculator/tst/main.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tst/main.py:2: in <module>
    from CalculatorLib.Calculator import Calculator
E   ModuleNotFoundError: No module named 'CalculatorLib'
!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!
================================== 1 error in 1.84 seconds ==================================

Note the ModuleNotFoundError in line 16 of the output. It means that Python could not find the CalculatorLib package.

Look at your folder structure:

$ tree .
.
├── src
│   ├── CalculatorLib
│   │   ├── Calculator.py
│   │   └── __init__.py
│   └── main.py
└── tst
    └── main.py

In the test script, we import the Calculator class from CalculatorLib with this statement:

from CalculatorLib.Calculator import Calculator

Python is interpreting this in the following way:

  • Look in the folder of the test script for a subfolder with the name CalculatorLib
  • There, look for a file Calculator.py
  • And in this file, use the class Calculator

Obviously, the folder CalculatorLib is NOT in the same folder as the test script: it is part of the src folder.

So, using the environment variable PYTHONPATH, we tell Python which additional directories to search for modules and packages.
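
Under the hood, Python searches the directories listed in sys.path, and the entries of PYTHONPATH are added to that list when the interpreter starts. The following sketch makes this visible by starting a child interpreter with PYTHONPATH set and reading back its sys.path. The helper name child_sys_path and the temporary directory are assumptions for this illustration; in the calculator project you would pass ./src instead.

```python
import json
import os
import subprocess
import sys
import tempfile

# Code executed by the child interpreter: dump its module search path as JSON.
CHILD_CODE = "import json, sys; print(json.dumps(sys.path))"


def child_sys_path(extra_dir):
    """Start a child interpreter with PYTHONPATH=extra_dir and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as src_dir:
        for entry in child_sys_path(src_dir):
            marker = "<- from PYTHONPATH" if entry == src_dir else ""
            print(entry, marker)
```

Any package folder inside a directory that shows up in this list can be imported by name, which is why the import of CalculatorLib works once ./src is on the path.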

Add additional functionality

Add a function at the end of your Calculator: src/CalculatorLib/Calculator.py

    ....
    def factorial(self, n):
        return 0

Add a call of the new function to your main app: src/main.py

    ...
    def run(self):
        ...
        print("4!    = %5d" % c.factorial(4))



Add a test for the new function to your test script: tst/main.py

    ...
    def test_factorial(self):
        self.assertEqual(24, self.c.factorial(4))

Try it:

$ python src/main.py
Init Calculator
5 + 3 =     8
8 - 4 =     4
5 * 3 =    15
8 / 4 =     2
8 ^ 4 =  4096
$ PYTHONPATH=./src python -m pytest tst/main.py
==================================== test session starts =====================================
platform darwin -- Python 3.7.4, pytest-4.4.1, py-1.8.0, pluggy-0.9.0
rootdir: ##/Testproject_Python-Calculator
plugins: cov-2.6.1
collected 6 items

tst/main.py ..F...                                                                      [100%]

========================================== FAILURES ==========================================
_______________________________ CalculatorTest.test_factorial ________________________________

self = <main.CalculatorTest testMethod=test_factorial>

    def test_factorial(self):
>       self.assertEqual(24, self.c.factorial(4))
E       AssertionError: 24 != 0

tst/main.py:31: AssertionError
============================= 1 failed, 5 passed in 0.14 seconds =============================

The test failed, as we expected.

Now implement the function correctly and then rerun the test.

Replace the stub at the end of your Calculator: src/CalculatorLib/Calculator.py

import math

class Calculator:
    ...
    def factorial(self, n):
        if not n >= 0:
            raise ValueError("n must be >= 0")

        if math.floor(n) != n:
            raise ValueError("n must be exact integer")

        if n+1 == n:  # catch a value like 1e300
            raise OverflowError("n too large")

        result, factor = 1, 2
        
        while factor <= n:
            result *= factor
            factor += 1

        return result

$ PYTHONPATH=./src python -m pytest tst/main.py  --verbose
==================================== test session starts =====================================
platform darwin -- Python 3.7.4, pytest-4.4.1, py-1.8.0, pluggy-0.9.0 -- ##/Testproject_Python-Calculator/.env/python/bin/python
cachedir: .pytest_cache
rootdir: ##/Testproject_Python-Calculator
plugins: cov-2.6.1
collected 6 items

tst/main.py::CalculatorTest::test_add PASSED                                             [ 16%]
tst/main.py::CalculatorTest::test_divide PASSED                                          [ 33%]
tst/main.py::CalculatorTest::test_factorial PASSED                                       [ 50%]
tst/main.py::CalculatorTest::test_multiply PASSED                                        [ 66%]
tst/main.py::CalculatorTest::test_power PASSED                                           [ 83%]
tst/main.py::CalculatorTest::test_subtract PASSED                                        [100%]

================================== 6 passed in 0.01 seconds ==================================
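
The new implementation also raises errors for invalid input, and those paths are not covered by test_factorial yet. A possible next step is sketched below. To keep the example self-contained, it copies the factorial logic into a plain function instead of importing CalculatorLib; the class name FactorialErrorTest and the helper run_suite are made up for this sketch.

```python
import math
import unittest


def factorial(n):
    """Same checks as Calculator.factorial, written as a plain function."""
    if not n >= 0:
        raise ValueError("n must be >= 0")
    if math.floor(n) != n:
        raise ValueError("n must be exact integer")
    if n + 1 == n:  # catch a value like 1e300
        raise OverflowError("n too large")
    result, factor = 1, 2
    while factor <= n:
        result *= factor
        factor += 1
    return result


class FactorialErrorTest(unittest.TestCase):

    def test_negative_input(self):
        with self.assertRaises(ValueError):
            factorial(-1)

    def test_fractional_input(self):
        with self.assertRaises(ValueError):
            factorial(2.5)

    def test_huge_float_input(self):
        with self.assertRaises(OverflowError):
            factorial(1e300)

    def test_zero(self):
        self.assertEqual(1, factorial(0))


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FactorialErrorTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

In the project itself, the same test methods could be added to CalculatorTest in tst/main.py and call self.c.factorial instead.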

Testing Frameworks

https://wiki.python.org/moin/PythonTestingToolsTaxonomy

Unit testing framework

import unittest

class TestStringMethods(unittest.TestCase):

    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        
        with self.assertRaises(TypeError):
            s.split(2)

if __name__ == '__main__':
    unittest.main()

pytest – helps you write better programs

# content of test_sample.py
def inc(x):
    return x + 1

def test_answer():
    assert inc(3) == 5  # fails on purpose: inc(3) returns 4

$ pytest

nose – is nicer testing for python

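# multiply() is a stand-in defined here so the example is self-contained
def multiply(a, b):
    return a * b

def test_numbers_3_4():
    assert multiply(3, 4) == 12

def test_strings_a_3():
    assert multiply('a', 3) == 'aaa'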

Python BDD Pattern

class MangoUseCase(TestCase):
  def setUp(self):
    self.user = 'placeholder'

  @mango.given('I am logged-in')
  def test_profile(self):
    self.given.profile = 'profile'
    self.given.photo = 'photo'

    self.given.notifications = 3
    self.given.notifications_unread = 1

    @mango.when('I click profile')
    def when_click_profile():
      print('click')

      @mango.then('I see profile')
      def then_profile():
        self.assertEqual(self.given.profile, 'profile')

      @mango.then('I see my photo')
      def then_photo():
        self.assertEqual(self.given.photo, 'photo')

radish is not just another BDD tool … the root from red to green

from radish import given, when, then

@given("I have the numbers {number1:g} and {number2:g}")
def have_numbers(step, number1, number2):
    step.context.number1 = number1
    step.context.number2 = number2

@when("I sum them")
def sum_numbers(step):
    step.context.result = step.context.number1 + \
        step.context.number2

@then("I expect the result to be {result:g}")
def expect_result(step, result):
    assert step.context.result == result

doctest

"""
The example module supplies one function, factorial().  For example,

>>> factorial(5)
120
"""

def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0

    Factorials of floats are OK, but the float must be an exact integer:
    >>> factorial(30.1)
    Traceback (most recent call last):
        ...
    ValueError: n must be exact integer
    >>> factorial(30.0)
    265252859812191058636308480000000

    It must also not be ridiculously large:
    >>> factorial(1e100)
    Traceback (most recent call last):
        ...
    OverflowError: n too large
    """

    import math
    if not n >= 0:
        raise ValueError("n must be >= 0")
    if math.floor(n) != n:
        raise ValueError("n must be exact integer")
    if n+1 == n:  # catch a value like 1e300
        raise OverflowError("n too large")
    result = 1
    factor = 2
    while factor <= n:
        result *= factor
        factor += 1
    return result

if __name__ == "__main__":
    import doctest
    doctest.testmod()
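
When all examples pass, doctest.testmod() prints nothing, so it can be hard to tell whether anything ran at all. One way to get numbers back is to drive the finder and runner yourself, as in the sketch below. The function inc and the helper run_doctests are assumptions for this illustration, not part of the example module above.

```python
import doctest


def inc(x):
    """Return x + 1.

    >>> inc(3)
    4
    >>> inc(-1)
    0
    """
    return x + 1


def run_doctests(func):
    """Run the doctests in func's docstring and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # globs provides the names that the docstring examples may use.
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


if __name__ == "__main__":
    failed, attempted = run_doctests(inc)
    print("doctest examples run:", attempted, "failed:", failed)
```

For everyday use, running the module with the -v flag (python example.py -v) makes doctest.testmod() print a detailed log instead.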

Sample Session with Test Frameworks

$ py.test -v
========================================================= test session starts ==========================================================
platform darwin -- Python 3.7.3, pytest-4.3.1, py-1.8.0, pluggy-0.9.0 -- /CLOUD/Development.Anaconda/anaconda3/bin/python
cachedir: .pytest_cache
rootdir: /CLOUD/Development.Python/Repositories.FromGithub/repositories/python-toolbox/Working-with-TDD/app, inifile:
plugins: remotedata-0.3.1, openfiles-0.3.2, doctestplus-0.3.0, arraydiff-0.3
collected 4 items

test_base.py::test_should_pass PASSED                                                                                            [ 25%]
test_base.py::test_should_raise_error PASSED                                                                                     [ 50%]
test_base.py::test_check_if_true_is_true PASSED                                                                                  [ 75%]
test_base.py::test_check_if_inc_works PASSED

$ nosetests -v
test_base.test_should_pass ... ok
test_base.test_should_raise_error ... ok
test_base.test_check_if_true_is_true ... ok
test_base.test_check_if_inc_works ... ok

----------------------------------------------------------------------
Ran 4 tests in 0.001s

OK

Links and additional information

http://pythontesting.net/

https://www.xenonstack.com/blog/test-driven-development-big-data/

https://realpython.com/python-testing/

Flask | Cookbook

Installation

$ pip install flask
$ flask --version
Python 3.7.3
Flask 1.1.1
Werkzeug 0.15.5

Creating an App

Create base python script app.py

from flask import Flask

app = Flask(__name__)

@app.route('/')
def example():
    return '{"name":"Bob"}'

if __name__ == '__main__':
    app.run()

Start Flask

$ flask run
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [01/Aug/2019 12:19:00] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [01/Aug/2019 12:19:00] "GET /favicon.ico HTTP/1.1" 404 -