{"id":6703,"date":"2020-07-01T17:06:31","date_gmt":"2020-07-01T15:06:31","guid":{"rendered":"https:\/\/blog.via-internet.de\/?p=6703"},"modified":"2020-07-01T17:06:31","modified_gmt":"2020-07-01T15:06:31","slug":"azure-databricks-working-with-unit-tests","status":"publish","type":"post","link":"https:\/\/via-internet.de\/blog\/2020\/07\/01\/azure-databricks-working-with-unit-tests\/","title":{"rendered":"Azure Databricks | Working with Unit Tests"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Introduction<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Problem<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Like any other program, Azure Databricks notebooks should be tested automatically to ensure code quality.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Standard Python test tools are hard to use here because <a href=\"https:\/\/wiki.python.org\/moin\/PythonTestingToolsTaxonomy\" target=\"_blank\" rel=\"noreferrer noopener\">these tools<\/a> operate on Python files in a file system, and a notebook doesn&#8217;t correspond to a Python file.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Solution<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To enable automated testing with <a rel=\"noreferrer noopener\" href=\"http:\/\/pyunit.sourceforge.net\/pyunit.html\" target=\"_blank\">unittest<\/a> (<a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/unittest.html\" target=\"_blank\">documentation<\/a>), we proceed as follows:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Create a test class that contains all the tests you want<\/li><li>Execute all defined tests<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Create Notebook with the Code<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">We will create a simple Notebook for our test.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This notebook will implement a simple calculator, so that we can test basic calculator operations such as add and multiply.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Create a new 
<a href=\"https:\/\/databricks-prod-cloudfront.cloud.databricks.com\/public\/4027ec902e239c93eaaa8714f173bcfc\/4113294248817247\/1934597425527841\/7111805527782941\/latest.html\" target=\"_blank\" rel=\"noreferrer noopener\">Notebook<\/a> with the name <em>Calculator<\/em>:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">class Calculator:\n\t# Simple calculator: each operation falls back to the stored\n\t# defaults self.x and self.y when an argument is omitted.\n\n\tdef __init__(self, x=10, y=8):\n\t\tself.x = x\n\t\tself.y = y\n\n\tdef add(self, x=None, y=None):\n\t\tif x is None: x = self.x\n\t\tif y is None: y = self.y\n\n\t\treturn x + y\n\n\tdef subtract(self, x=None, y=None):\n\t\tif x is None: x = self.x\n\t\tif y is None: y = self.y\n\n\t\treturn x - y\n\n\tdef multiply(self, x=None, y=None):\n\t\tif x is None: x = self.x\n\t\tif y is None: y = self.y\n\n\t\treturn x * y\n\n\tdef divide(self, x=None, y=None):\n\t\tif x is None: x = self.x\n\t\tif y is None: y = self.y\n\n\t\t# Reject a zero divisor explicitly\n\t\tif y == 0:\n\t\t\traise ValueError('cannot divide by zero')\n\n\t\treturn x \/ y<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The notebook should look like this:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/blog.via-internet.de\/wp-content\/uploads\/2020\/07\/21_testclass_creating-700x640.png\" alt=\"\" class=\"wp-image-6718\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">To use this class, write the following lines:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">c = Calculator()\nprint(c.add(20, 10), c.subtract(20, 10), c.multiply(20, 10), c.divide(20, 
10))<\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/blog.via-internet.de\/wp-content\/uploads\/2020\/07\/12_class_using-700x164.png\" alt=\"\" class=\"wp-image-6720\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Create Notebook with the Tests<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Create a new <a href=\"https:\/\/databricks-prod-cloudfront.cloud.databricks.com\/public\/4027ec902e239c93eaaa8714f173bcfc\/4113294248817247\/1934597425527846\/7111805527782941\/latest.html\" target=\"_blank\" rel=\"noreferrer noopener\">Notebook<\/a> <strong>in the same folder<\/strong> with the name <em>Calculator.Tests<\/em>.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>The name is not important, but it is convenient to give the test notebook the name of the notebook under test plus the suffix &#8216;Tests&#8217;.<\/p><\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">Create the first command to import the Calculator notebook. In Databricks this is typically done with the <code>%run<\/code> magic command, which must be the only content of its cell:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">%run .\/Calculator<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Create the test class:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import unittest\n\nclass CalculatorTests(unittest.TestCase):\n\n  @classmethod\n  def setUpClass(cls):\n    # Runs once: one Calculator instance is shared by all tests\n    cls.app = Calculator()\n\n  def setUp(self):\n    # print(\"this is setup for every method\")\n    pass\n\n  def test_add(self):\n    self.assertEqual(self.app.add(10, 5), 15)\n\n  def test_subtract(self):\n    self.assertEqual(self.app.subtract(10, 5), 5)\n    
self.assertNotEqual(self.app.subtract(10, 2), 4)\n\n  def test_multiply(self):\n    self.assertEqual(self.app.multiply(10, 5), 50)\n\n  def tearDown(self):\n    # print(\"teardown for every method\")\n    pass\n\n  @classmethod\n  def tearDownClass(cls):\n    # print(\"this is teardown class\")\n    pass<\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/i1.wp.com\/blog.via-internet.de\/wp-content\/uploads\/2020\/07\/21_testclass_creating.png?fit=700%2C640&amp;ssl=1\" alt=\"\" class=\"wp-image-6718\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Create the code to run the tests:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">suite = unittest.TestLoader().loadTestsFromTestCase(CalculatorTests)\nunittest.TextTestRunner(verbosity=2).run(suite)<\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/blog.via-internet.de\/wp-content\/uploads\/2020\/07\/22_testclass_using-700x379.png\" alt=\"\" class=\"wp-image-6717\"\/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Problem Like any other program, Azure Databricks notebooks should be tested automatically to ensure code quality. Using standard Python Test Tools is not easy because these tools are based on Python files in a file system. And a notebook doesn&#8217;t correspond to a Python file. Solution To enable automated testing with unittest (documentation), we proceed as follows: Create a test class that contains all the tests you want Execution of all defined tests Create Notebook with the Code We will create a simple Notebook for our test. 
This notebook will implement a simple calculator, so that we can test the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":6707,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[9,20],"tags":[],"class_list":["post-6703","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-azure","category-databricks"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/posts\/6703","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/comments?post=6703"}],"version-history":[{"count":0,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/posts\/6703\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/media?parent=6703"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/categories?post=6703"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/tags?post=6703"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}