Spread Knowledge

CS605 - Software Engineering II - Lecture Handout 42

User Rating:  / 0
PoorBest 

Related Content: CS605 - VU Lectures, Handouts, PPT Slides, Assignments, Quizzes, Papers & Books of Software Engineering II

Software Refactoring

Software refactoring is the process of changing a software system such that the external behavior of the system does not change while the internal structure of the system is improved. This is sometimes called “Improving the design after it has been written”.

Fowler defines refactoring as A change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior. It is achieved by applying a series of refactorings without changing its observable behavior.

A (Very) Simple Example

Let us consider a very simple example. The refactoring involved in this case is known as “Consolidate Duplicate Conditional Fragments”. As the name suggests, this refactoring lets the programmer improve the quality of the code by grouping together the duplicate code, resulting in less maintenance afterwards.

if (isSpecialDeal()) {
    total = price * 0.95;
    send ();
}
else {
    total = price * 0.98;
    send ();
}

In this case, send is being called from different places. We can consolidate as follows:

if (isSpecialDeal())
total = price * 0.95;
else
total = price * 0.98;

send ();

It can further be improved if we calculate total outside the if statement as shown below.

if (isSpecialDeal())
factor = 0.95;
else
factor = 0.98;

total = price * factor;
send ();

Although this is a trivial example, it nevertheless is useful and teaches us how can be consolidate code by grouping together pieces of code from different segments.

Refactoring: Where to Start?

The first question that we have to ask ourselves is: how do you identify code that needs to be refactored? Fowler et al has devised a heuristic based approach to this end known as “Bad Smells” in Code. The philosophy is simple: “if it stinks, change it”.

Bad Smells in Code

They have identified many different types of “bad smells”. These are briefly described in the following paragraphs:

    • Duplicated Code
      • bad because if you modify one instance of duplicated code but not the others, you (may) have introduced a bug!
    • Long Method
      • long methods are more difficult to understand; performance concerns with respect to lots of short methods are largely obsolete
    • Large Class
      • Large classes try to do too much, which reduces cohesion
    • Long Parameter List
      • hard to understand, can become inconsistent
    • Divergent Change
      • Deals with cohesion; symptom: one type of change requires changing one subset of methods; another type of change requires changing another subset
    • Shotgun Surgery
      • a change requires lots of little changes in a lot of different classes
    • Feature Envy
      • A method requires lots of information from some other class (move it closer!)
    • Data Clumps
      • attributes that clump together but are not part of the same class
    • Primitive Obsession
      • characterized by a reluctance to use classes instead of primitive data types
    • Switch Statements
      • Switch statements are often duplicated in code; they can typically be replaced by use of polymorphism (let OO do your selection for you!)
    • Parallel Inheritance Hierarchies
      • Similar to Shotgun Surgery; each time I add a subclass to one hierarchy, I need to do it for all  related hierarchies
    • Lazy Class
      • A class that no longer “pays its way”. e.g. may be a class that was downsized by refactoring, or represented planned functionality that did not pan out
    • Speculative Generality
      • “Oh I think we need the ability to do this kind of thing someday”
    • Temporary Field
      • An attribute of an object is only set in certain circumstances; but an object should need all of its attributes
    • Message Chains
      • a client asks an object for another object and then asks that object for another object etc. Bad because client depends on the structure of the navigation
    • Middle Man
      • If a class is delegating more than half of its responsibilities to another class, do you really need it?
    • Inappropriate Intimacy
      • Pairs of classes that know too much about each other’s private details
    • Alternative Classes with Different Interfaces
      • Symptom: Two or more methods do the same thing but have different signature for what they do
    • Incomplete Library Class
      • A framework class doesn’t do everything you need
    • Data Class
      • These are classes that have fields, getting and setting methods for the fields, and nothing else; they are data holders, but objects should be about data AND behavior
    • Refused Bequest
      • A subclass ignores most of the functionality provided by its superclass
    • Comments (!)
      • Comments are sometimes used to hide bad code
      • “…comments often are used as a deodorant” (!)

    Breaking a Method

    We have already seen example of duplicate code. We now look at another simple example of long method. Although, in this case, the code is not really long, it however demonstrates how longer segments of code can be broken into smaller and more manageable (may be more reusable as well) code segments.

    The following code segment sorts an array of integers using “selection sort” algorithm.

    for (i=0; i < N-1; i++) {
    min = i;
    for (j = i; j < N; j++)
    if (a[j] < a[min]) min = j;
    temp = a[i];
    a[i] = a[min];
    a[min] = temp;
    }

    We break it into smaller fragments by making smaller functions out of different steps in the algorithm as follows:

    int minimum (int a[ ], int from, int to)
    {
    int min = from;
    for (int i = from; i <= to; i++)
    if (a[i] < a[min]) min = i;
    return min;
    }

    void swap (int &x, int &y)
    {
    int temp = x;
    x = y;
    y = temp;
    }

    The sort function now becomes simpler as shown below.

    for (i=0; i < N-1; i++) {
    min = minimum (a, i, N-1);
    swap(a[i], a[min]);
    }

    It can be seen that it is now much easier to understand the code and hence is easier to maintain. At the same time we have got two separate reusable functions that can be used elsewhere in the code.

    A slightly more involved example
    (It has mostly been adapted from Fowler’s introduction to refactoring which is freely available on the web)

    Let us consider a simple program for a video store. It has three classes: Movie, Rental, and Customer. Program is told which movies a customer rented and for how long and it then calculates the charges and the Frequent renter points. Customer object can print a statement (in ASCII).

    Here is the code for the classes. DomainObject is a general class that does a few standard things, such as hold a name.

    public class DomainObject {
     
                   public DomainObject (String name)    {
                                   _name = name;
                   };
     
                   public DomainObject ()        {};
     
                   public String name ()            {
                                   return _name;
                   };
     
                   public String toString() {
                                   return _name;
                   };
     
                   protected String _name = "no name";
    }

    Movie represents the notion of a film. A video store might have several tapes in stock of the same movie

    public class Movie extends DomainObject {
        public static final int  CHILDRENS = 2;
        public static final int  REGULAR = 0;
        public static final int  NEW_RELEASE = 1;
    private int _priceCode;
     
                   public Movie(String name, int priceCode) {
                                   _name = name;
                                   _priceCode = priceCode;
                   }
     
                   public int priceCode() {
                                   return _priceCode;
                   }
     
                   public void persist() {
                                   Registrar.add ("Movies", this);
                   }
     
                   public static Movie get(String name) {
                                   return (Movie) Registrar.get ("Movies", name);
                   }
    }

    The movie uses a class called a registrar (not shown) as a class to hold instances of movie. I also do this with other classes. I use the message persist to tell an object to save itself to the registrar. I can then retrieve the object, based on its name, with a get(String) method.
    The tape class represents a physical tape.

    class Tape extends DomainObject
        {
        public Movie movie() {
                   return _movie;
        }
        public Tape(String serialNumber, Movie movie) {
                   _serialNumber = serialNumber;
                   _movie = movie;
        }
        private String _serialNumber;
        private Movie _movie;
        }

    The rental class represents a customer renting a movie.

    class Rental extends DomainObject
        {
        public int daysRented() {
                   return _daysRented;
        }
        public Tape tape() {
                   return _tape;
        }
        private Tape _tape;
        public Rental(Tape tape, int daysRented) {
                   _tape = tape;
                   _daysRented = daysRented;
        }
        private int _daysRented;
        }

    The customer class represents the customer. So far all the classes have been dumb encapsulated data. Customer holds all the behavior for producing a statement in its statement() method.

    class Customer extends DomainObject
        {
        public Customer(String name) {
            _name = name;
        }
        public String statement() {
            double totalAmount = 0;
            int frequentRenterPoints = 0;
            Enumeration rentals = _rentals.elements();
            String result = "Rental Record for " + name() + "\n";
            while (rentals.hasMoreElements()) {
                double thisAmount = 0;
                Rental each = (Rental) rentals.nextElement();
     
                //determine amounts for each line
                switch (each.tape().movie().priceCode()) {
                    case Movie.REGULAR:
                        thisAmount += 2;
                        if (each.daysRented() > 2)
                            thisAmount += (each.daysRented() - 2) * 1.5;
                        break;
                    case Movie.NEW_RELEASE:
                        thisAmount += each.daysRented() * 3;
                        break;
                    case Movie.CHILDRENS:
                        thisAmount += 1.5;
                        if (each.daysRented() > 3)
                            thisAmount += (each.daysRented() - 3) * 1.5;
                        break;
     
                }
                totalAmount += thisAmount;
     
                // add frequent renter points
                frequentRenterPoints ++;
                // add bonus for a two day new release rental
                if ((each.tape().movie().priceCode() == Movie.NEW_RELEASE) && each.daysRented() > 1) frequentRenterPoints ++;
     
                //show figures for this rental
                result += "\t" + each.tape().movie().name()+ "\t" + String.valueOf(thisAmount) + "\n";
     
            }
            //add footer lines
            result +=  "Amount owed is " + String.valueOf(totalAmount) + "\n";
            result += "You earned " + String.valueOf(frequentRenterPoints) + " frequent renter points";
            return result;
     
        }
        public void addRental(Rental arg) {
                   _rentals.addElement(arg);
        }
        public static Customer get(String name) {
                   return (Customer) Registrar.get("Customers", name);
        }
        public void persist() {
                   Registrar.add("Customers", this);
        }
        private Vector _rentals = new Vector();
        }

    What are your impressions about the design of this program? I would describe as not well designed, and certainly not object-oriented. For a simple program is this, that does not really matter. Thereís nothing wrong with a quick and dirty simple program. But if we imagine this as a fragment of a more complex system, then I have some real problems with this program. That long statement routine in the Customer does far too much. Many of the things that it does should really be done by the other classes.
    This is really brought out by a new requirement, just in from the users, they want a similar statement in html. As you look at the code you can see that it is impossible to reuse any of the behavior of the current statement() method for an htmlStatement(). Your only recourse is to write a whole new method that duplicates much of the behavior of statement(). Now of course this is not too onerous. You can just copy the statement() method and make whatever changes you need. So the lack of design does not do too much to hamper the writing of htmlStatement(), (although it might be tricky to figure out exactly where to do the changes). But what happens when the charging rules change? You have to fix both statement() and htmlStatement(), and ensure the fixes are consistent. The problem from cut and pasting code comes when you have to change it later. Thus if you are writing a program that you donít expect to change, then cut and paste is fine. If the program is long lived and likely to change, then cut and paste is a menace.
    But you still have to write the htmlStatement() program. You may feel that you should not touch the existing statement() method, after all it works fine. Remember the old engineering adage: "if it ainít broke, donít fix it". statement() may not be broke, but it does hurt. It is making your life more difficult to write the htmlStatement() method.
    So this is where refactoring comes in. When you find you have to add a feature to a program, and the programís code is not structured in a convenient way to add the feature; then first refactor the program to make it easy to add the feature, then add the feature.

    Extracting the Amount Calculation

    The obvious first target of my attention is the overly long statement() method. When I look at a long method like that, I am looking to take a chunk of the code an extract a method from it.

    Extracting a method is taking the chunk of code and making a method out of it. An obvious piece here is the switch statement (which I'm highlighting below).

    public String statement() {
            double totalAmount = 0;
            int frequentRenterPoints = 0;
            Enumeration rentals = _rentals.elements();
            String result = "Rental Record for " + name() + "\n";
            while (rentals.hasMoreElements()) {
                double thisAmount = 0;
                Rental each = (Rental) rentals.nextElement();
    //determine amounts for each line
                switch (each.tape().movie().priceCode()) {
                    case Movie.REGULAR:
                        thisAmount += 2;
                        if (each.daysRented() > 2)
                            thisAmount += (each.daysRented() - 2) * 1.5;
                        break;
                    case Movie.NEW_RELEASE:
                        thisAmount += each.daysRented() * 3;
                        break;
                    case Movie.CHILDRENS:
                        thisAmount += 1.5;
                        if (each.daysRented() > 3)
                            thisAmount += (each.daysRented() - 3) * 1.5;
                        break;
     
                }
                totalAmount += thisAmount;
     
                // add frequent renter points
                frequentRenterPoints ++;
                // add bonus for a two day new release rental
                if ((each.tape().movie().priceCode() == Movie.NEW_RELEASE) && each.daysRented() > 1) frequentRenterPoints ++;
     
                //show figures for this rental
                result += "\t" + each.tape().movie().name()+ "\t" + String.valueOf(thisAmount) + "\n";
     
            }
            //add footer lines
            result +=  "Amount owed is " + String.valueOf(totalAmount) + "\n";
            result += "You earned " + String.valueOf(frequentRenterPoints) + " frequent renter points";
            return result;
    }
    
    
        

    This looks like it would make a good chunk to extract into its own method. When we extract a method, we need to look in the fragment for any variables that are local in scope to the method we are looking at, that local variables and parameters. This segment of code uses two: each and thisAmount. Of these each is not modified by the code but thisAmount is modified. Any non-modified variable we can pass in as a parameter. Modified variables need more care. If there is only one we can return it. The temp is initialized to 0 each time round the loop, and not altered until the switch gets its hands on it. So we can just assign the result. The extraction looks like this.

    public String statement() {
            double totalAmount = 0;
            int frequentRenterPoints = 0;
            Enumeration rentals = _rentals.elements();
            String result = "Rental Record for " + name() + "\n";
            while (rentals.hasMoreElements()) {
                double thisAmount = 0;
                Rental each = (Rental) rentals.nextElement();
     
                //determine amounts for each line
                thisAmount = amountOf(each);
                totalAmount += thisAmount;
     
                // add frequent renter points
                frequentRenterPoints ++;
                // add bonus for a two day new release rental
                if ((each.tape().movie().priceCode() == Movie.NEW_RELEASE) && each.daysRented() > 1) frequentRenterPoints ++;
     
                //show figures for this rental
                result += "\t" + each.tape().movie().name()+ "\t" + String.valueOf(thisAmount) + "\n";
     
            }
            //add footer lines
            result +=  "Amount owed is " + String.valueOf(totalAmount) + "\n";
            result += "You earned " + String.valueOf(frequentRenterPoints) + " frequent renter points";
            return result;
     
        }
        private int amountOf(Rental each) {
            int thisAmount = 0;
            switch (each.tape().movie().priceCode()) {
                case Movie.REGULAR:
                    thisAmount += 2;
                    if (each.daysRented() > 2)
                        thisAmount += (each.daysRented() - 2) * 1.5;
                    break;
                case Movie.NEW_RELEASE:
                    thisAmount += each.daysRented() * 3;
                    break;
                case Movie.CHILDRENS:
                    thisAmount += 1.5;
                    if (each.daysRented() > 3)
                        thisAmount += (each.daysRented() - 3) * 1.5;
                    break;
     
            }
            return thisAmount;
        }

    When I did this the tests blew up. A couple of the test figures gave me the wrong answer. I was puzzled for a few seconds then realized what I had done. Foolishly I had made the return type of amountOf() int instead of double.

        private double amountOf(Rental each) {
            double thisAmount = 0;
            switch (each.tape().movie().priceCode()) {
                case Movie.REGULAR:
                    thisAmount += 2;
                    if (each.daysRented() > 2)
                        thisAmount += (each.daysRented() - 2) * 1.5;
                    break;
                case Movie.NEW_RELEASE:
                    thisAmount += each.daysRented() * 3;
                    break;
                case Movie.CHILDRENS:
                    thisAmount += 1.5;
                    if (each.daysRented() > 3)
                        thisAmount += (each.daysRented() - 3) * 1.5;
                    break;
     
            }
            return thisAmount;
        }

    Itís the kind of silly mistake that I often make, and it can be a pain to track down as Java converts ints to doubles without complaining (but merrily rounding). Fortunately it was easy to find in this case, because the change was so small. Here is the essence of the refactoring process illustrated by accident. Because each change is so small, any errors are very easy to find. You don't spend long debugging, even if you are as careless as I am.

    This refactoring has taken a large method and broken it down into two much more manageable chunks. We can now consider the chunks a bit better. I don't like some of the variables names in amountOf() and this is a good place to change them.

    private double amountOf(Rental aRental) {
            double result = 0;
            switch (aRental.tape().movie().priceCode()) {
                case Movie.REGULAR:
                    result += 2;
                    if (aRental.daysRented() > 2)
                        result += (aRental.daysRented() - 2) * 1.5;
                    break;
                case Movie.NEW_RELEASE:
                    result += aRental.daysRented() * 3;
                    break;
                case Movie.CHILDRENS:
                    result += 1.5;
                    if (aRental.daysRented() > 3)
                        result += (aRental.daysRented() - 3) * 1.5;
                    break;
            }
            return result;
        }

    Is that renaming worth the effort? Absolutely. Good code should communicate what it is doing clearly, and variable names are key to clear code. Never be afraid to change the names to things to improve clarity. With good find and replace tools, it is usually not difficult. Strong typing and testing will highlight anything you miss. Remember any fool can write code that a computer can understand, good programmers write code that humans can understand.