This lesson is being piloted (Beta version)
If you teach this lesson, please tell the authors and provide feedback by opening an issue in the source repository

Intermediate Research Software Development: Glossary

Key Points

Section 1: Setting Up Environment For Collaborative Code Development
  • In order to develop (write, test, debug, backup) code efficiently, you need to use a number of different tools.

  • When there is a choice of tools for a task you will have to decide which tool is right for you, which may be a matter of personal preference or what the team or community you belong to is using.

Introduction to Our Software Project
  • Programming interfaces define how individual modules within a software application interact among themselves or how the application itself interacts with its users.

  • MVC is a software design architecture which divides the application into three interconnected modules: Model (data), View (user interface), and Controller (input/output and data manipulation).

  • The software project we use throughout this course is an example of an MVC application that manipulates patients’ inflammation data and performs basic statistical analysis using Python.

Collaborative Software Development Using Git and Bitbucket
  • A branch is one version of your project that can contain its own set of commits.

  • Feature branches enable us to develop / explore / test new code features without affecting the stable main code.

Python Code Style Conventions
  • Always assume that someone else will read your code at a later date, including yourself.

  • Community coding conventions help you create more readable software projects that are easier to contribute to.

  • Python Enhancement Proposals (or PEPs) describe a recommended convention or specification for how to do something in Python.

  • Style checking to ensure code conforms to coding conventions is often part of IDEs.

  • Consistency with the style guide is important - whichever style you choose.

Verifying Code Style Using Linters
  • Use linting tools on the command line (or via continuous integration) to automatically check your code style.

Section 2: Ensuring Correctness of Software at Scale
  • Using testing requires us to change our practice of code development, but saves time in the long run by allowing us to more comprehensively and rapidly find faults in code, as well as giving us greater confidence in the correctness of our code.

  • The use of test techniques and infrastructures such as parameterisation and Continuous Integration can help scale and further automate our testing process.

Automatically Testing Software
  • The three main types of automated tests are unit tests, functional tests and regression tests.

  • We can write unit tests to verify that functions generate expected output given a set of specific inputs.

  • It should be easy to add or change tests, understand and run them, and understand their results.

  • We can use a unit testing framework like Pytest to structure and simplify the writing of tests in Python.

  • We should test for expected errors in our code.

  • Testing program behaviour against both valid and invalid inputs is important and is known as data validation.

Scaling Up Unit Testing
  • We can assign multiple inputs to tests using parametrisation.

  • It’s important to understand the coverage of our tests across our code.

  • Writing unit tests takes time, so apply them where it makes the most sense.

Continuous Integration for Automated Testing
  • Continuous Integration can run tests automatically to verify changes as code develops in our repository.

  • CI builds are typically triggered by commits pushed to a repository.

  • We need to write a configuration file to inform a CI service what to do for a build.

  • We can run - and get reports from - different CI infrastructure builds simultaneously.

Section 3: Software Development as a Process
  • Software engineering takes a wider view of software development beyond programming (or coding).

  • Ensuring requirements are sufficiently captured is critical to the success of any project.

  • Following a process makes development predictable, can save time, and helps ensure each stage of development is given sufficient consideration before proceeding to the next.

Software Requirements
  • When writing software used for research, requirements will almost always change.

  • Consider non-functional requirements (how the software will behave) as well as functional requirements (what the software is supposed to do).

  • The environment in which users run our software has an effect on many design choices we might make.

  • Consider the expected longevity of any code before you write it.

  • The perspective and language of a particular requirement stakeholder group should be reflected in requirements for that group.

Software Architecture and Design
  • Planning software projects in advance can save a lot of effort and reduce ‘technical debt’ later - even a partial plan is better than no plan at all.

  • By breaking down our software into components with a single responsibility, we avoid having to rewrite it all when requirements change. Such components can be as small as a single function, or be a software package in their own right.

  • When writing software used for research, requirements will almost always change.

  • ‘Good code is written so that is readable, understandable, covered by automated tests, not over complicated and does well what is intended to do.’

Programming Paradigms
  • A software paradigm describes a way of structuring or reasoning about code.

  • Different programming languages are suited to different paradigms.

  • Different paradigms are suited to solving different classes of problems.

  • A single piece of software will often contain instances of multiple paradigms.

Functional Programming (optional)
  • Functional programming is a programming paradigm where programs are constructed by applying and composing smaller and simple functions into more complex ones (which describe the flow of data within a program as a sequence of data transformations).

  • In functional programming, functions tend to be pure - they do not exhibit side-effects (by not affecting anything other than the value they return or anything outside a function). Functions can also be named, passed as arguments, and returned from other functions, just as any other data type.

  • MapReduce is an instance of a data generation and processing approach, in particular suited for functional programming and handling Big Data within parallel and distributed environments.

  • Python provides comprehensions for lists, dictionaries, sets and generators - a concise (if not strictly functional) way to generate new data from existing data collections while performing sophisticated mapping, filtering and conditional logic on original dataset’s members.

Object Oriented Programming (optional)
  • Object oriented programming is a programming paradigm based on the concept of classes, which encapsulate data and code.

  • Classes allow us to organise data into distinct concepts.

  • By breaking down our data into classes, we can reason about the behaviour of parts of our data.

  • Relationships between concepts can be described using inheritance (is a) and composition (has a).

Architecture Revisited: Extending Software (optional)
  • By breaking down our software into components with a single responsibility, we avoid having to rewrite it all when requirements change. Such components can be as small as a single function, or be a software package in their own right.

Section 4: Collaborative Software Development for Reuse
  • Agreeing on a set of best practices within a software development team will help to improve your software’s understandability, extensibility, testability, reusability and overall sustainability.

Developing Software In a Team: Code Review
  • Code review is a team software quality assurance practice where team members look at parts of the codebase in order to improve their code’s readability, understandability, quality and maintainability.

  • It is important to agree on a set of best practices and establish a code review process in a team to help to sustain a good, stable and maintainable code for many years.

Section 5: Managing and Improving Software Over Its Lifetime
  • For software to succeed it needs to be managed as well as developed.

Assessing Software for Suitability and Improvement
  • It’s as important to have a critical attitude to adopting software as we do to developing it.

  • As a team agree on who and to what extent you will support software you make available to others.

Section 6: Continuous Delivery
  • Continuous Delivery automates the software release process, which enables you to deliver updates faster

Continuous Delivery/Continuous Deployment with BitBucket Pipelines
  • You can configure automated deployment in the bitbucket-pipelines.yml file

Wrap-up
  • Collaborative techniques and tools play an important part of research software development in teams.

Glossary