Just moozing

Before you can check your notes, you must make them…


As stated last week, I will do my backup system properly, and I start by writing a script that goes through the files searching for duplicates. In this post I will go through the basic idea and document how I want to implement the system. This step is trivial for experienced programmers with a project of this size, but it is an example of how to start a project.


I will use a directory tree like the following (using a tree one-liner)


The program is called Yarch.

  • The top level directory contains the project files used by geany and the top level unit test file.
  • Documentation is for non-programming files, like the UML documentation (located in Documentation/Yarch_UML).
  • html is for the doxygen generated documentation.
  • tests is for unit testing. It is important to keep the unit test code separated from the code that it tests. When I start unit testing, the test data will be in a subfolder called data.
  • Yarch contains all the source code. Any modules will be placed in subdirectories.

All data will be version controlled, except html (which is generated) and temporary data.


The project has certain functional requirements:

  • It must not change the files (in this version at least).
  • It must show a list of files from the secondary directory with the corresponding duplicate in the master directory.
  • The user will be able to choose the two directories to be compared.

I will do the project in Python when I get to the programming (this would be a non-functional requirement).
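A minimal sketch of how the requirements above could look in Python. It rests on assumptions that are not decided in this post: a "duplicate" is taken to mean a file with identical content, compared by SHA-256 digest, and the names file_digest, index_by_digest and find_duplicates are hypothetical. os.walk is used here only to keep the sketch short.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_digest(root):
    """Map each content digest under root to the first path found with it."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            index.setdefault(file_digest(path), path)
    return index


def find_duplicates(master, secondary):
    """Return (secondary_path, master_path) pairs with identical content.

    Files are only opened for reading, so nothing is changed on disk.
    """
    master_index = index_by_digest(master)
    pairs = []
    for dirpath, _dirnames, filenames in os.walk(secondary):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            match = master_index.get(file_digest(path))
            if match is not None:
                pairs.append((path, match))
    return pairs
```

The two directories are plain arguments, so the user's choice of master and secondary directory can come from the command line or anywhere else. Hashing every file is the simplest approach, not the fastest; comparing file sizes first would be one obvious refinement.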

Activity diagram for processing a directory. Note the recursiveness.

UML diagrams will be used. The standard is set by the Object Management Group, and the UML 2.3 specification may be downloaded here. Standards are hard to read, and it is difficult for the uninitiated to extract the important parts. I have relied on this book to guide me. I use BOUML to draw the diagrams.

The activity diagram shows the basic way of handling directories and files. The “process file” activity could be more or less anything, as long as it is to be done on every file in the directory tree. I have omitted every reference to the objects that will be used. There will be objects used and mutated by the different activities, but I have not decided on the details, since that is a design issue.

I have decided to process files first. This is a design decision that has been shown here. It is not important for this script, nor for the tests, but it might become relevant when discussing performance, especially for very large directory structures.
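The recursion and the files-first order can be sketched in a few lines of Python. This is only an illustration of the idea, not the final design: the function name process_directory and the process_file callback are hypothetical, and skipping symbolic links to directories is my own assumption to avoid endless loops.

```python
import os


def process_directory(path, process_file):
    """Apply process_file to every file under path, files before subdirectories."""
    entries = sorted(os.listdir(path))
    full_paths = [os.path.join(path, entry) for entry in entries]

    # Files in this directory are handled first (the design decision above).
    for full_path in full_paths:
        if os.path.isfile(full_path):
            process_file(full_path)

    # Then each subdirectory is processed by the same function (the recursion).
    # Assumption: symbolic links to directories are skipped to avoid loops.
    for full_path in full_paths:
        if os.path.isdir(full_path) and not os.path.islink(full_path):
            process_directory(full_path, process_file)
```

Calling process_directory(".", print) lists every file below the current directory. Since “process file” is passed in as a function, the traversal can be tested on its own with a callback that just collects the paths it receives.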

Activity diagram for processing a file.

For each file in the directory tree, the script is to perform the activities shown in the diagram. I have still not decided on how to implement the script. What to implement as classes is decided at the end, but in the activity diagram I see functionality that begs for tests, like testing if a file may in fact be added to a list.

Programming methodology

I have been reading about and practicing Test driven development, and in my opinion it is a very good way of doing development in groups. It is also a very good way when you are on your own and doing iterative development with a long time between the iterations. The tests will help you remember the code, and since the code always passes all tests, you will always have a program that is functioning.

The latest addition to my virtual shelf is found here. I am (mostly) doing the Test driven development cluster of practices described in chapter 15.

My iterations (after the initial setup of the project)

  1. Decide on what to program next. This will be a very small step, like adding setX() and getX() to a class, or to create a class.
  2. Write a test to test the unwritten functions or class. I find instantiating a non-existing class to be a good first test.
  3. Run it and watch it fail – if it doesn’t, you have done something wrong.
  4. Implement the function or the frame of the class.
  5. Run tests and see every test pass. Otherwise, fix what failed. Having other tests fail is normal, especially if you are tampering with data classes of widespread use.
  6. If there are some obvious failure modes of the function or class, go to 2 and go through the process of testing them. This is the case with mutator functions: check that they behave correctly when they get parameters that are out of range.
  7. Go back to 1.
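As an illustration of one pass through the iteration above, here is a small sketch using Python's unittest module. The class Sample, its limit MAX_X and the choice of raising ValueError are all hypothetical; the point is the order in which things get written, with the tests coming before the code they test.

```python
import unittest


class Sample:
    """Hypothetical data class; written only after the tests below existed."""

    MAX_X = 100

    def __init__(self):
        self._x = 0

    def get_x(self):
        return self._x

    def set_x(self, value):
        # Step 6: reject out-of-range input instead of storing it.
        if not 0 <= value <= self.MAX_X:
            raise ValueError("x must be between 0 and %d" % self.MAX_X)
        self._x = value


class SampleTest(unittest.TestCase):
    def test_can_be_instantiated(self):
        # Step 2: the first test, written before the class existed.
        self.assertIsNotNone(Sample())

    def test_set_x_then_get_x(self):
        sample = Sample()
        sample.set_x(42)
        self.assertEqual(sample.get_x(), 42)

    def test_set_x_rejects_out_of_range(self):
        sample = Sample()
        with self.assertRaises(ValueError):
            sample.set_x(Sample.MAX_X + 1)
        # A rejected value must leave the old one untouched.
        self.assertEqual(sample.get_x(), 0)
```

Saved in a file, the tests can be run with python -m unittest followed by the file name. In a real session the three tests would be added one at a time, each one failing before the matching part of Sample is written.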

The textbook way is more strict. It includes that you are only to write the smallest possible code that makes the test pass. That approach (and other TDD related information) may be found here.

As you may have noticed, the script I am about to program is sketchy at best. The important point is that I have defined what it is to do from the customer's point of view. The rest will be decided and defined as I go along. It will be highly dependent on what code I have in stock, examples from the internet and the libraries I decide to use.

The last question is when to do tests. I do test-first when implementing a new feature. The test will fail at first (actually, it must fail first), and then when I have implemented the feature it will pass. Whenever I discover a bug, I create a test (and test data) which recreates the bug, and when the bug is fixed the test passes. In my opinion, test-last breaks the entire idea of Test driven development.
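A sketch of what such a bug-recreating test could look like. The bug is imagined for the sake of the example: a hypothetical helper same_content that once compared file sizes only. The test data is created in a temporary directory here to keep the example self-contained; in the project it would live in the data subfolder of tests.

```python
import os
import tempfile
import unittest


def same_content(path_a, path_b):
    """Hypothetical helper: True when both files hold identical bytes."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    # The imagined bug was returning True here, after comparing sizes only.
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        return file_a.read() == file_b.read()


class SameSizeRegressionTest(unittest.TestCase):
    def setUp(self):
        # Test data that recreates the bug: equal size, different content.
        self.tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self.tmp.cleanup)
        self.path_a = self._write("a.bin", b"abcd")
        self.path_b = self._write("b.bin", b"abce")

    def _write(self, name, data):
        path = os.path.join(self.tmp.name, name)
        with open(path, "wb") as handle:
            handle.write(data)
        return path

    def test_equal_size_different_content_is_not_a_duplicate(self):
        self.assertFalse(same_content(self.path_a, self.path_b))

    def test_identical_files_still_match(self):
        self.assertTrue(same_content(self.path_a, self.path_a))
```

The first test fails against the buggy version and passes once the content comparison is in place. It then stays in the suite, so the same bug cannot return unnoticed.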


Written by moozing

July 12, 2010 at 09:00

Posted in Tech
