Monday, October 10, 2005

Mini HOWTO #1: chroot-ed FTP with wu-ftpd

Scenario: We have an Apache server whose DocumentRoot directory is /var/www/html. We have wu-ftpd running as the FTP server.

Goal: We want developers to be able to access /var/www/html via ftp, but we want to grant access only to that directory and below.

Solution: Set up a chroot-ed ftp environment

1. Create special 'ftpuser' user and group:

useradd ftpuser

2. Change /etc/passwd entry for user ftpuser to:

ftpuser:x:501:501::/var/www/html/.:/sbin/nologin

(note the dot after the chroot directory: wu-ftpd chroots to the path before the '/./', and the part after the dot becomes the user's home directory inside the chroot)

3. Add /sbin/nologin to /etc/shells.

4. Create and set permissions on the following directories under the chroot directory:

cd /var/www/html
mkdir -p bin dev etc usr/lib
chmod 0555 bin dev etc usr/lib

5. Copy ls and more binaries to the bin subdirectory:

cp /bin/ls bin
cp /bin/more bin
chmod 111 bin/ls bin/more

Also copy to usr/lib all the libraries needed by ls and more. Run "ldd /bin/ls" and "ldd /bin/more" to see which shared libraries you need to copy. For example:

-rwxr-xr-x 1 root root 495474 Jan 7 15:44 ld-linux.so.2
-rwxr-xr-x 1 root root 5797952 Jan 7 15:44 libc.so.6
-rwxr-xr-x 1 root root 11832 Jan 7 15:44 libtermcap.so.2
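
If you'd rather not hunt down the libraries by hand, a short script can do it. Here is a minimal Python sketch using the paths from this HOWTO (note that the dynamic loader line in ldd's output has no '=>' arrow, so ld-linux.so may still need to be copied manually):

#!/usr/bin/env python
# Sketch: copy the shared libraries reported by ldd into the chroot
import os, re, shutil

chroot_lib = "/var/www/html/usr/lib"

for binary in ("/bin/ls", "/bin/more"):
    for line in os.popen("ldd %s" % binary):
        # typical ldd line: "libc.so.6 => /lib/libc.so.6 (0x...)"
        m = re.search(r"=>\s*(/\S+)", line)
        if m:
            print "Copying", m.group(1)
            shutil.copy(m.group(1), chroot_lib)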

6. Create special "zero" file in dev subdirectory:

cd dev
mknod -m 666 zero c 1 5
chown root.mem zero

7. Create bare-bones passwd and group files in etc subdirectory:

passwd:

root:x:0:0:root:/root:/sbin/nologin
ftpuser:x:501:501::/var/www/html:/sbin/nologin

group:

root:x:0:root
ftpuser:x:501:

8. Edit /etc/ftpaccess and add the following lines:

class all real,guest *

guestgroup ftpuser

chmod no guest,anonymous
umask no guest,anonymous
delete no anonymous
overwrite no anonymous
rename no anonymous

upload /var/www/html / yes root ftpuser 0664 dirs

9. Change the group (via chgrp) of files under /var/www/html to ftpuser
  • also change permissions to 775 for directories and 664 for files
  • but be careful to exclude the bin, dev, etc and usr subdirectories (a script that automates this step is sketched below)
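
Here is a minimal Python sketch of that automation, assuming the ftpuser group already exists and the special subdirectories live directly under the docroot:

#!/usr/bin/env python
# Sketch: chgrp/chmod everything under the docroot, skipping the
# bin, dev, etc and usr chroot support directories
import os, grp

docroot = "/var/www/html"
skip = ["bin", "dev", "etc", "usr"]
gid = grp.getgrnam("ftpuser").gr_gid

for root, dirs, files in os.walk(docroot):
    if root == docroot:
        dirs[:] = [d for d in dirs if d not in skip]
    for d in dirs:
        path = os.path.join(root, d)
        os.chown(path, -1, gid)
        os.chmod(path, 0775)
    for f in files:
        path = os.path.join(root, f)
        os.chown(path, -1, gid)
        os.chmod(path, 0664)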

10. Modify httpd.conf so that access to special subdirectories is not allowed:

<Directory /var/www/html/bin>
order deny,allow
deny from all
</Directory>

<Directory /var/www/html/dev>
order deny,allow
deny from all
</Directory>

<Directory /var/www/html/etc>
order deny,allow
deny from all
</Directory>

<Directory /var/www/html/usr>
order deny,allow
deny from all
</Directory>



11. Restart Apache and wu-ftpd


12. Test by ftp-ing as user ftpuser
  • Verify that you can upload/delete files in /var/www/html and subdirectories
  • Verify that you can't access files outside of /var/www/html and subdirectories

System administration and security mini HOWTOs

Over the years I kept notes on how to do various sysadmin/security-related tasks. I thought it might be a good idea to post some of them on this blog, both for my own reference and for other folks who might be interested. The first "Mini HOWTO" post will be on setting up a chroot-ed FTP environment with wu-ftpd.

Friday, October 07, 2005

Configuring OpenLDAP as a replacement for NIS

Here's a step-by-step tutorial on installing OpenLDAP on a Red Hat Linux system and configuring it as a replacement for NIS. In a future blog post I intend to cover the python-ldap package.

Install OpenLDAP
# tar xvfz openldap-stable-20050429.tgz
# cd openldap-2.2.26
# ./configure
# make
# make install

Configure and run the OpenLDAP server process slapd
  • In what follows, the LDAP domain is 'myldap'
  • Change the slapd root password:

[root@myhost openldap]# slappasswd
New password:
Re-enter new password:
{SSHA}dYjrA1-JukrfESe/8b1HdZWfcToVE/cC
  • Edit /usr/local/etc/openldap/slapd.conf

    • Change my-domain to myldap

    • Point 'directory' entry to /usr/local/var/openldap-data/myldap

    • Point 'rootpw' entry to line obtained via slappasswd: 'rootpw {SSHA}dYjrA1-JukrfESe/8b1HdZWfcToVE/cC'

    • Add following lines after 'include /usr/local/etc/openldap/schema/core.schema' line:

include /usr/local/etc/openldap/schema/cosine.schema
include /usr/local/etc/openldap/schema/inetorgperson.schema
include /usr/local/etc/openldap/schema/misc.schema
include /usr/local/etc/openldap/schema/nis.schema
include /usr/local/etc/openldap/schema/openldap.schema
  • Create data directory:

mkdir /usr/local/var/openldap-data/myldap
  • Start up slapd server:

/usr/local/libexec/slapd
  • Test slapd by running an ldap search:

# ldapsearch -x -b '' -s base '(objectclass=*)' namingContexts
# extended LDIF
#
# LDAPv3
# base <> with scope base
# filter: (objectclass=*)
# requesting: namingContexts
#

#
dn:
namingContexts: dc=myldap,dc=com

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

Populate the LDAP database
  • Create /usr/local/var/openldap-data/myldap/myldap.ldif LDIF file:

dn: dc=myldap,dc=com
objectclass: dcObject
objectclass: organization
o: My LDAP Domain
dc: myldap

dn: cn=Manager,dc=myldap,dc=com
objectclass: organizationalRole
cn: Manager
  • Add LDIF contents to the LDAP database via ldapadd:

# ldapadd -x -D "cn=Manager,dc=myldap,dc=com" -W -f myldap.ldif
Enter LDAP Password:
adding new entry "dc=myldap,dc=com"

adding new entry "cn=Manager,dc=myldap,dc=com"
  • Verify that the entries were added by doing an LDAP search:

[root@myhost myldap]# ldapsearch -x -b 'dc=myldap,dc=com' '(objectclass=*)'
# extended LDIF
#
# LDAPv3
# base <dc=myldap,dc=com> with scope sub
# filter: (objectclass=*)
# requesting: ALL
#

# myldap.com
dn: dc=myldap,dc=com
objectClass: dcObject
objectClass: organization
o: My LDAP Domain
dc: myldap

# Manager, myldap.com
dn: cn=Manager,dc=myldap,dc=com
objectClass: organizationalRole
cn: Manager

# search result
search: 2
result: 0 Success

# numResponses: 3
# numEntries: 2
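
The same search can also be run from Python via the python-ldap package mentioned at the beginning of this post. A minimal sketch, assuming slapd runs on localhost and allows anonymous reads:

import ldap

l = ldap.initialize("ldap://localhost")
l.simple_bind_s("", "")  # anonymous bind
results = l.search_s("dc=myldap,dc=com", ldap.SCOPE_SUBTREE, "(objectclass=*)")
for dn, attrs in results:
    print dn
    for attr, values in attrs.items():
        print "    %s: %s" % (attr, values)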
  • Sample LDIF file with organizational unit info: /usr/local/var/openldap-data/myldap/myldap_ou.ldif

dn: ou=Sales,dc=myldap,dc=com
ou: Sales
objectClass: top
objectClass: organizationalUnit
description: Members of Sales

dn: ou=Engineering,dc=myldap,dc=com
ou: Engineering
objectClass: top
objectClass: organizationalUnit
description: Members of Engineering
  • Add contents of LDIF file to LDAP database via ldapadd:

[root@myhost myldap]# ldapadd -x -D "cn=Manager,dc=myldap,dc=com" -W -f myldap_ou.ldif
Enter LDAP Password:
adding new entry "ou=Sales,dc=myldap,dc=com"

adding new entry "ou=Engineering,dc=myldap,dc=com"
  • Sample LDIF file with user info: /usr/local/var/openldap-data/myldap/myldap_user.ldif

dn: cn=Larry Fine,ou=Sales,dc=myldap,dc=com
ou: Sales
o: myldap
cn: Larry Fine
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: [email protected]
givenname: Larry
sn: Fine
uid: larry
homePostalAddress: 15 Cherry Ln.$Plano TX 78888
postalAddress: 215 Fitzhugh Ave.
l: Dallas
st: TX
postalcode: 75226
telephoneNumber: (800)555-1212
homePhone: 800-555-1313
facsimileTelephoneNumber: 800-555-1414
userPassword: larrysecret
title: Account Executive
destinationindicator: /bios/images/lfine.jpg

dn: cn=Moe Howard,ou=Sales,dc=myldap,dc=com
ou: Sales
o: myldap
cn: Moe Howard
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: [email protected]
givenname: Moe
sn: Howard
uid: moe
initials: Bob
homePostalAddress: 16 Cherry Ln.$Plano TX 78888
postalAddress: 216 South Fitzhugh Ave.
l: Dallas
st: TX
postalcode: 75226
pager: 800-555-1319
homePhone: 800-555-1313
telephoneNumber: (800)555-1213
mobile: 800-555-1318
title: Manager of Product Development
facsimileTelephoneNumber: 800-555-3318
manager: cn=Larry Fine,ou=Sales,dc=myldap,dc=com
userPassword: moesecret
destinationindicator: /bios/images/mhoward.jpg

dn: cn=Curley Howard,ou=Engineering,dc=myldap,dc=com
ou: Engineering
o: myldap
cn: Curley Howard
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: [email protected]
givenname: Curley
sn: Howard
uid: curley
initials: Joe
homePostalAddress: 14 Cherry Ln.$Plano TX 78888
postalAddress: 2908 Greenville Ave.
l: Dallas
st: TX
postalcode: 75206
pager: 800-555-1319
homePhone: 800-555-1313
telephoneNumber: (800)555-1214
mobile: 800-555-1318
title: Development Engineer
facsimileTelephoneNumber: 800-555-3318
userPassword: curleysecret
destinationindicator: /bios/images/choward.jpg
  • Add contents of LDIF file to LDAP database via ldapadd:

[root@myhost myldap]# ldapadd -x -D "cn=Manager,dc=myldap,dc=com" -W -f myldap_user.ldif
Enter LDAP Password:
adding new entry "cn=Larry Fine,ou=Sales,dc=myldap,dc=com"

adding new entry "cn=Moe Howard,ou=Sales,dc=myldap,dc=com"

adding new entry "cn=Curley Howard,ou=Engineering,dc=myldap,dc=com"
  • Verify entries were added by doing an LDAP search:

[root@myhost myldap]# ldapsearch -x -b 'dc=myldap,dc=com' '(objectclass=*)'
  • Search output should end with:

# search result
search: 2
result: 0 Success

# numResponses: 8
# numEntries: 7
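
As an aside, entries can also be added programmatically with python-ldap instead of the ldapadd CLI. A minimal sketch (the entry and the bind password are made-up examples, not part of the entry counts above):

import ldap
import ldap.modlist

l = ldap.initialize("ldap://localhost")
l.simple_bind_s("cn=Manager,dc=myldap,dc=com", "secret")  # placeholder password

# a made-up fourth user, for illustration only
entry = {
    'objectClass': ['top', 'person', 'organizationalPerson', 'inetOrgPerson'],
    'cn': ['Shemp Howard'],
    'sn': ['Howard'],
    'ou': ['Sales'],
    'mail': ['[email protected]'],
}
l.add_s("cn=Shemp Howard,ou=Sales,dc=myldap,dc=com",
        ldap.modlist.addModlist(entry))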

Replace NIS with LDAP

Generate ldif files from /etc/passwd and /etc/group and add them to the LDAP database
  • Generate an LDIF file for creating the 'people' and 'group' organizational units; edit /usr/local/var/openldap-data/myldap/myldap_people.ldif:

dn: ou=people,dc=myldap,dc=com
objectclass: organizationalUnit
ou: people

dn: ou=group,dc=myldap,dc=com
objectclass: organizationalUnit
ou: group
  • Insert contents of myldap_people.ldif in LDAP database:

# ldapadd -x -D "cn=Manager,dc=myldap,dc=com" -W -f myldap_people.ldif
  • Download and unpack the MigrationTools package:

# tar xvfz MigrationTools.tgz
# cd MigrationTools-46/
  • Edit migrate_common.ph and specify following settings:

$DEFAULT_MAIL_DOMAIN = "myldap.com";
$DEFAULT_BASE = "dc=myldap,dc=com";
$DEFAULT_MAIL_HOST = "mail.myldap.com";
  • Generate passwd.ldif and group.ldif files:

[root@myhost MigrationTools-46]# ./migrate_passwd.pl /etc/passwd /usr/local/var/openldap-data/myldap/myldap_passwd.ldif
[root@myhost MigrationTools-46]# ./migrate_group.pl /etc/group /usr/local/var/openldap-data/myldap/myldap_group.ldif
  • Insert contents of myldap_passwd.ldif and myldap_group.ldif in LDAP database:

# ldapadd -x -D "cn=Manager,dc=myldap,dc=com" -W -f myldap_passwd.ldif
# ldapadd -x -D "cn=Manager,dc=myldap,dc=com" -W -f myldap_group.ldif

Install the pam_ldap and nss_ldap modules
  • Install pam_ldap.tgz

# tar xvfz pam_ldap.tgz
# cd pam_ldap-180
# ./configure
# make
# make install
  • Install nss_ldap.tgz

# tar xvfz nss_ldap.tgz
# cd nss_ldap-243/
# ./configure --enable-rfc2307bis
# make
# make install
  • (See NOTE below before doing this) Edit /etc/ldap.conf (note that there's also /etc/openldap/ldap.conf; you need the one in /etc) and specify the following settings:

base dc=myldap,dc=com
scope sub
timelimit 30
pam_filter objectclass=posixAccount
nss_base_passwd ou=People,dc=myldap,dc=com?one
nss_base_shadow ou=People,dc=myldap,dc=com?one
nss_base_group ou=Group,dc=myldap,dc=com?one
  • (See NOTE below before doing this) Edit /etc/nsswitch.conf and specify:

passwd:     files ldap
shadow: files ldap
group: files ldap

NOTE: Instead of manually modifying /etc/ldap.conf and /etc/nsswitch.conf, you should run the authconfig utility and specify the LDAP server IP and the LDAP base DN ('dc=myldap,dc=com' in our example). authconfig will automatically modify /etc/ldap.conf (minus the nss_base entries), /etc/nsswitch.conf and also /etc/pam.d/system-auth. This is how /etc/pam.d/system-auth looks on a RHEL 4 system after running authconfig:

#%PAM-1.0
# This file is auto-generated.
# User changes will be destroyed the next time authconfig is run.
auth required /lib/security/$ISA/pam_env.so
auth sufficient /lib/security/$ISA/pam_unix.so likeauth nullok
auth sufficient /lib/security/$ISA/pam_ldap.so use_first_pass
auth required /lib/security/$ISA/pam_deny.so

account required /lib/security/$ISA/pam_unix.so broken_shadow
account sufficient /lib/security/$ISA/pam_succeed_if.so uid < 100 quiet
account [default=bad success=ok user_unknown=ignore] /lib/security/$ISA/pam_ldap.so
account required /lib/security/$ISA/pam_permit.so

password requisite /lib/security/$ISA/pam_cracklib.so retry=3
password sufficient /lib/security/$ISA/pam_unix.so nullok use_authtok md5 shadow
password sufficient /lib/security/$ISA/pam_ldap.so use_authtok
password required /lib/security/$ISA/pam_deny.so

session required /lib/security/$ISA/pam_limits.so
session required /lib/security/$ISA/pam_unix.so
session optional /lib/security/$ISA/pam_ldap.so

Test the LDAP installation with an LDAP-only user

  • Add a new user to the LDAP database which doesn't exist in /etc/passwd; create the /usr/local/var/openldap-data/myldap/myldap_myuser.ldif file:

dn: uid=myuser,ou=People,dc=myldap,dc=com
uid: myuser
cn: myuser
objectClass: account
objectClass: posixAccount
objectClass: top
objectClass: shadowAccount
userPassword: secret
shadowLastChange: 13063
shadowMax: 99999
shadowWarning: 7
loginShell: /bin/bash
uidNumber: 500
gidNumber: 500
homeDirectory: /home/myuser

dn: cn=myuser,ou=Group,dc=myldap,dc=com
objectClass: posixGroup
objectClass: top
cn: myuser
userPassword: {crypt}x
gidNumber: 500
  • Add contents of the myldap_myuser.ldif file to LDAP database via ldapadd:

# ldapadd -x -D "cn=Manager,dc=myldap,dc=com" -W -f myldap_myuser.ldif
  • Create /home/myuser directory and change permissions:

# mkdir /home/myuser
# chown myuser.myuser /home/myuser
  • Change the password for user 'myuser' via ldappasswd:

ldappasswd -x -D "cn=Manager,dc=myldap,dc=com" -W -S "uid=myuser,ou=People,dc=myldap,dc=com"
  • Log in from a remote system via ssh as user myuser; everything should work fine

Adding another host to the myldap LDAP domain
  • On any client machine that you want to join the myldap LDAP domain

    • Make sure the OpenLDAP client package is installed (from source or RPM)

    • Install the nss_ldap and pam_ldap packages

    • Run authconfig and indicate the LDAP server and the LDAP base DN

    • In a terminal console, try to su as user myuser (which doesn't exist locally); it should work

      • To avoid the "home directory not found" message, you'll also need to NFS-mount the home directory of user myuser from the LDAP server

    • Restart sshd and try to ssh from a remote machine as user myuser; it should work (it didn't work in my case until I restarted sshd)


Various notes
  • At this point, you can maintain a central repository of user accounts by adding/deleting/modifying them on the LDAP server machine via various LDAP client utilities such as ldapadd/ldapdelete/ldapmodify
    • For example, to delete user myuser and group myuser, you can run the following commands (a python-ldap equivalent is sketched after this list):
# ldapdelete -x -D "cn=Manager,dc=myldap,dc=com" -W 'uid=myuser,ou=People,dc=myldap,dc=com'
# ldapdelete -x -D "cn=Manager,dc=myldap,dc=com" -W 'cn=myuser,ou=Group,dc=myldap,dc=com'
  • I experimented with various ACL entries in slapd.conf in order to allow users to change their own passwords via 'passwd'; however, I was unable to find the proper ACL incantations for doing this (if anybody has a recipe for this, please leave a comment)
  • To properly secure the LDAP communication between clients and the LDAP server, you should enable SSL/TLS (see this HOWTO)
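
Here is a minimal python-ldap equivalent of the ldapdelete commands above (the bind password is a placeholder):

import ldap

l = ldap.initialize("ldap://localhost")
l.simple_bind_s("cn=Manager,dc=myldap,dc=com", "secret")  # placeholder password
l.delete_s("uid=myuser,ou=People,dc=myldap,dc=com")
l.delete_s("cn=myuser,ou=Group,dc=myldap,dc=com")
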
Here are some links I found very useful:

OpenLDAP Quick Start Guide
YoLinux LDAP Tutorial
Linux LDAP Authentication article at linux.com
LDAP for Rocket Scientists -- Open Source guide at zytrax.com
Paranoid Penguin - Authenticate with LDAP part III at linuxjournal.com
Turn your world LDAP-tastic -- blog entry by Ed Dumbill

Thursday, September 29, 2005

HoneyMonkeys: an adventure in black box testing

HoneyMonkeys is the name of a Microsoft research project in computer security. It combines the concept of honeypots with an attitude of "monkey see, monkey do". Specifically, it consists of a cluster of WinXP machines with various configurations (SP1, SP2 non-patched, SP2 partially patched, SP2 fully patched) running as Virtual Machines for easy rollout and reloading.

The XP machines run the IE browser in an automated fashion, pointing it to sites known or suspected for hosting malware. Each machine also runs monitoring software that records every single file and Registry read/write, as well as any attempt to hook malware into Auto-Start Extensibility Points -- for many more details on this see this research report from Microsoft. The machines act as "monkeys" by merely pointing the browser to suspected malicious Web sites and then waiting for a few minutes. The automated IE drivers do not click on any dialog box elements that might prompt for installation of software. Thus, every file that gets created outside the browser's temporary directory, and every Registry write means that malware was installed automatically, without the action of the "user" (i.e. the monkey in this case). When a machine detects that malware was installed, it forwards the URL to a "better" machine (in terms of service packs and patches installed on it) in the cluster. If the URL gets to a fully patched machine and still results in the installation of malware, it means that a zero-day exploit has been found, i.e. an exploit that exists in the wild for which there is no available patch.

As the authors of the research report point out, this approach qualifies as "black-box", since it simply points the browser to various URLs and watches for modifications to the file system, the registry and the memory. A more "white-box" approach would be to attempt to identify malware by trying to match signatures or behaviors against a known list/database. The black-box approach turns out to be much simpler to implement and very effective. The authors report finding the first zero-day exploit using their HoneyMonkeys setup in July 2005.

I think there are a lot of lessons in this story for us testers:
  • Use Virtual Machine technologies such as VMWare or VirtualPC for easy rollout and reload of multiple OS/software configurations -- when a HoneyMonkey machine is infected with malware, its Virtual Machine image is simply reloaded from a "golden image"
  • Automate, automate, automate -- there is no way "real monkeys" in the shape of humans can click through thousands of URLs in order to find the ones that host malware
  • Apply the KISS principle -- the monkey software is purposely kept simple and stupid; the intelligence resides with the various pieces of monitoring software that watch for modifications to the host machine
  • Don't underestimate black-box techniques -- there is a tendency to relegate black-box techniques to a second-rate status compared to white-box testing; as the HoneyMonkey project demonstrates, sometimes the easier way out is better
For system/security administrators who deal with XP, the bigger lesson is of course to fully patch their machines and instruct their users not to click on popups and other prompts. This is of course easier said than done.

Friday, September 23, 2005

Oblique Strategies and testing

A message posted to comp.lang.python pointed me to a post by Robin Parmar on Oblique Strategies. I had read about this concept before, but I didn't really delve into it, so it was nice to see it mentioned again. The Oblique Strategies are one-line sentences devised by Brian Eno and Peter Schmidt as ways to "jog your mind" and get you unstuck in moments when your creative juices don't flow as much as you would like to. They offer "tangential" solutions to problems, as opposed to the more obvious, and oftentimes futile, "head-on" solutions.

It strikes me that the Oblique Strategies could be an important tool in a tester's arsenal. After all, good testers should be able to "sniff" problems that are not obvious; they should be able to go on "tangents" at any time, to follow their intuition in finding bugs that might be due to subtle interactions. I find it funny that, according to Brian Eno, the very first Oblique Strategy he wrote was "Honour thy error as a hidden intention." Errors, bugs...sounds pretty familiar to me!

I was thrilled when I saw that Robin wrote a Python script that emits a randomly chosen Oblique Strategy every time it's run. I plan on using it regularly to jog my devious tester mind :-)
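
For the curious, the core of such a script can be just a couple of lines. A minimal sketch, assuming the strategies are stored one per line in a plain-text file (the file name here is made up):

import random

# strategies.txt is a hypothetical file with one strategy per line
strategies = [line.strip() for line in open("strategies.txt") if line.strip()]
print random.choice(strategies)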

Here's one strategy that was already printed twice by the script, so I'd better pay attention to it today: Slice into equal pieces. I can't really tell you what that means, I'm not yet done mind-jogging....

Web app testing with Python part 3: twill

In a recent thread on comp.lang.python, somebody was inquiring about ways to test whether a Web site is up or not from within Python code. Some options were proposed, among which I referred the OP to twill, a Web application testing package written in pure Python by Titus Brown (who, I can proudly say, is a fellow SoCal Piggie).

I recently took the latest version of twill for a ride and I'll report here some of my experiences. My application testing scenario was to test a freshly installed instance of Bugzilla. I wanted to verify that I could correctly post bugs and retrieve bugs by bug number. Using twill, all this proved to be a snap.

First, a few words about twill: it's a re-implementation of Cory Dodt's PBP package based on the mechanize module written by John J. Lee. Since mechanize implements the HTTP request/response protocol and parses the resulting HTML, we can categorize twill as a "Web protocol driver" tool (for more details on such taxonomies, see a previous post of mine).

Twill can be used as a domain specific language via a command shell (twill-sh), or it can be used as a normal Python module, from within your Python code. I will show both usage models.

After downloading twill and installing it via the usual "python setup.py install" method, you can start its command line interpreter via the twill-sh script installed in /usr/local/bin. At the interpreter prompt, you can then issue commands such as:
  • go <url> -- visit the given URL.
  • code <int> -- assert that the last page loaded had this HTTP status, e.g. code 200 asserts that the page loaded fine.
  • find <regexp> -- assert that the page contains this regular expression.
  • showforms -- show all of the forms on the page.
  • formvalue <formnum> <fieldname> <value> -- set the given field in the given form to the given value. For read-only form widgets/controls, the click may be recorded for use by submit, but the value is not changed.
  • submit [<n>] -- click the n'th submit button, if given; otherwise submit via the last submission button clicked; if nothing clicked, use the first submit button on the form.
Let's see a quick example of the twill shell in action. As I mentioned before, I wanted to test a freshly-installed instance of Bugzilla, namely I wanted to verify that I can add new bugs and then retrieve them via their bug number. Here is a shell session fragment that opens the Bugzilla main page via the go command and clicks on the "Enter a new bug report" link via the follow command:

[ggheo@concord twill-latest]$ twill-sh

-= Welcome to twill! =-

current page: *empty page*
>> go http://example.com/bugs/
==> at http://example.com/bugs/
current page: http://example.com/bugs/
>> follow "Enter a new bug report"
==> at http://example.com/bugs/enter_bug.cgi
current page: http://example.com/bugs/enter_bug.cgi

At this point, we can issue the showforms command to see what forms are available on the current page.

>> showforms
Form #1
## __Name______ __Type___ __ID________ __Value__________________
Bugzilla ... text (None)
Bugzilla ... password (None)
product hidden (None) TestProduct
1 GoAheadA ... submit (None) Login
Form #2
## __Name______ __Type___ __ID________ __Value__________________
a hidden (None) reqpw
loginname text (None)
1 submit (None) Submit Request
Form #3
## __Name______ __Type___ __ID________ __Value__________________
id text (None)
1 submit (None) Find
current page: http://example.com/bugs/enter_bug.cgi

It looks like we're on the login page. We can then use the formvalue (or fv for short) command to fill in the required fields (user name and password), then the submit command in order to complete the login process. The submit command takes an optional argument -- the number of the submit button you want to click. With no arguments, it activates the first submit button it finds.

>> fv 1 Bugzilla_login [email protected]
current page: http://example.com/bugs/enter_bug.cgi
>> fv 1 Bugzilla_password mypassword
current page: http://example.com/bugs/enter_bug.cgi
>> submit 1
current page: http://example.com/bugs/enter_bug.cgi

At this point, we can verify that we received the expected HTTP status code (200 when everything was OK) via the code command:

>> code 200
current page: http://example.com/bugs/enter_bug.cgi

We run showforms again to see what forms and fields are available on the current page, then we use fv to fill in a bunch of fields for the new bug we want to enter, and finally we submit the form (note how nicely twill displays the available fields, as well as the first few selections available in drop-down combo boxes):

>> showforms
Form #1
## __Name______ __Type___ __ID________ __Value__________________
product hidden (None) TestProduct
version select (None) ['other'] of ['other']
component select (None) ['TestComponent'] of ['TestComponent']
rep_platform select (None) ['Other'] of ['All', 'DEC', 'HP', 'M ...
op_sys select (None) ['other'] of ['All', 'Windows 3.1', ...
priority select (None) ['P2'] of ['P1', 'P2', 'P3', 'P4', 'P5']
bug_severity select (None) ['normal'] of ['blocker', 'critical' ...
bug_status hidden (None) NEW
assigned_to text (None)
cc text (None)
bug_file_loc text (None) http://
short_desc text (None)
comment textarea (None)
form_name hidden (None) enter_bug
1 submit (None) Commit
2 maketemplate submit (None) Remember values as bookmarkable template
Form #2
## __Name______ __Type___ __ID________ __Value__________________
id text (None)
1 submit (None) Find
current page: http://example.com/bugs/enter_bug.cgi
>> fv 1 op_sys "Linux"
current page: http://example.com/bugs/enter_bug.cgi
>> fv 1 priority P1
current page: http://example.com/bugs/enter_bug.cgi
>> fv 1 assigned_to grig@example.com
current page: http://example.com/bugs/enter_bug.cgi
>> fv 1 short_desc "twill-generated bug"
current page: http://example.com/bugs/enter_bug.cgi
>> fv 1 comment "This is a new bug opened automatically via twill"
current page: http://example.com/bugs/enter_bug.cgi
>> submit
Note: submit is using submit button: name="None", value=" Commit "
current page: http://example.com/bugs/post_bug.cgi

Now we can verify that the bug with the specified description was posted. We use the find command, which takes a regular expression as an argument:

>> find "Bug \d+ Submitted"
current page: http://example.com/bugs/post_bug.cgi
>> find "twill-generated bug"
current page: http://example.com/bugs/post_bug.cgi

No errors were reported, which means the validations succeeded. At this point, we can also inspect the current page via the show_html command in order to see the bug number that Bugzilla automatically assigned. I won't actually show all the HTML; suffice it to say that the bug was assigned number 2. We can then go directly to the page for bug #2 and verify that the various bug elements we indicated were indeed posted correctly:

>> go "http://example.com/bugs/show_bug.cgi?id=2"
==> at http://example.com/bugs/show_bug.cgi?id=2
current page: http://example.com/bugs/show_bug.cgi?id=2
>> find "Linux"
current page: http://example.com/bugs/show_bug.cgi?id=2
>> find "P1"
current page: http://example.com/bugs/show_bug.cgi?id=2
>> find "grig@example.com"
current page: http://example.com/bugs/show_bug.cgi?id=2
>> find "twill-generated bug"
current page: http://example.com/bugs/show_bug.cgi?id=2
>> find "This is a new bug opened automatically via twill"
current page: http://example.com/bugs/show_bug.cgi?id=2

I mentioned that all the commands available in the interactive twill-sh command interpreter are also available as top-level functions to be used inside your Python code. All you need to do is import the necessary functions from the twill.commands module.

Here's how a Python script that tests functionality similar to what I described above would look:

#!/usr/bin/env python

from twill.commands import go, follow, showforms, fv, submit, find, code, save_html
import os, time, re

def get_bug_number(html_file):
    h = open(html_file)
    bug_number = "-1"
    for line in h:
        s = re.search("Bug (\d+) Submitted", line)
        if s:
            bug_number = s.group(1)
            break
    h.close()
    return bug_number

# MAIN
crt_time = time.strftime("%Y%m%d%H%M%S", time.localtime())
temp_html = "temp.html"

# Open a new bug report
go("http://www.example.com/bugs")
follow("Enter a new bug report")

# Log in
fv("1", "Bugzilla_login", "[email protected]")
fv("1", "Bugzilla_password", "mypassword")
submit()
code("200")

# Enter bug info
fv("1", "op_sys", "Linux")
fv("1", "priority", "P1")
fv("1", "assigned_to", "[email protected]")
fv("1", "short_desc", "twill-generated bug at " + crt_time)
fv("1", "comment", "This is a new bug opened automatically via twill at " + crt_time)
submit()
code("200")

# Verify bug info
find("Bug \d+ Submitted")
find("twill-generated bug at " + crt_time)

# Get bug number
save_html(temp_html)
bug_number = get_bug_number(temp_html)
os.unlink(temp_html)

assert bug_number != "-1"

# Go to bug page and verify more detailed info
go("http://example.com/bugs/show_bug.cgi?id=" + bug_number)
code("200")
find("P1")
find("Linux")
find("[email protected]")
find("This is a new bug opened automatically via twill at " + crt_time)

I added some extra functionality to the Python script -- such as adding the current time to the bug description, so that a different bug description is inserted into the Bugzilla database every time the test script runs (the current time doesn't of course guarantee uniqueness, but it will do for now :-)). I also used the save_html function in order to save the "Bug posted" page to a temporary file, so that I can retrieve the bug number and query the individual bug page.

Conclusion

Twill is an excellent tool for testing Web applications. It can also be used to automate form handling, especially for Web sites that require a login. I especially like the fact that everything can be run from the command line -- both the twill shell and the Python scripts based on twill. This means that deploying twill is a snap, and there are no cumbersome GUIs to worry about. The assertion commands built into twill (code, find and notfind) should be enough for testing Web sites that use straight HTML and forms. For more complicated, Javascript-intensive Web sites, a tool such as Selenium might be more appropriate.

I haven't looked into twill's cookie-handling capabilities, but they're available, according to the README. Some more aspects of twill that I haven't experimented with yet:
  • Script recording: Titus has written a maxq add-on that can be used to automatically record twill-based scripts while browsing the Web site under test; for more details on maxq, see also a previous post of mine
  • Extending twill: you can easily add commands to the twill interpreter
Kudos to Titus for writing a powerful, yet easy to use testing tool.

Friday, September 16, 2005

CherryPy, Cheetah and PEAK on IBM developerWorks

I haven't read these articles yet and I wanted to have the links in one place for future reference:

Monday, September 12, 2005

Jakob Nielsen on Usability Testing

Do you spend one day per week on observing how new users interact with your product? In fact, do you have any usability testing at all in your budget? In the rare event that you do run usability testing sessions, do you focus on actual user behavior (and not waste time by having users fill in endless questionnaires)? If you tend to answer "no" to these questions, read Jakob Nielsen's article on how to properly conduct usability testing sessions.

Running a Python script as a Windows service

This is a message I posted to comp.lang.python regarding ways to run a regular Python script as a Windows service.

I will assume you want to turn a script called myscript.py into a service.

1. Install Win2K Resource Kit (or copy the 2 binaries instsrv.exe and srvany.exe).

2. Run instsrv to install srvany.exe as a service with the name myscript:
C:\Program Files\Resource Kit\instsrv myscript "C:\Program Files\Resource Kit\srvany.exe"

3. Go to Computer Management->Services and make sure myscript is listed as a service. Also make sure the Startup Type is Automatic.

4. Create a myscript.bat file with the following contents in e.g. C:\pyscripts:

C:\Python23\python C:\pyscripts\myscript.py

(replace Python23 with your Python version)

5. Create new registry entries for the new service.
  • run regedt32 and go to the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\myscript entry
  • add new key (Edit->Add Key) called Parameters
  • add new entry for Parameters key (Edit->Add Value) to set the Application name
    • Name should be Application
    • Type should be REG_SZ
    • Value should be path to myscript.bat, i.e. C:\pyscripts\myscript.bat
  • add new entry for Parameters key (Edit->Add Value) to set the working directory
    • Name should be AppDir
    • Type should be REG_SZ
    • Value should be path to pyscripts directory, i.e. C:\pyscripts
6. Test starting and stopping the myscript service in Computer Management->Services.
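
If you'd rather not click through regedt32, the registry entries in step 5 can also be created with a few lines of Python using the standard _winreg module. A minimal sketch (same service name and paths as above; run it with Administrator rights):

import _winreg

# Create the Parameters key for the 'myscript' service
key = _winreg.CreateKey(
    _winreg.HKEY_LOCAL_MACHINE,
    r"SYSTEM\CurrentControlSet\Services\myscript\Parameters")
# Application: full path to the batch file that launches the script
_winreg.SetValueEx(key, "Application", 0, _winreg.REG_SZ,
                   r"C:\pyscripts\myscript.bat")
# AppDir: working directory for the service
_winreg.SetValueEx(key, "AppDir", 0, _winreg.REG_SZ, r"C:\pyscripts")
_winreg.CloseKey(key)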

Michael Feathers on Unit Testing Rules

Short but high-impact post from Michael Feathers (of "Working Effectively with Legacy Code" fame). His main recommendation is to have unit tests that do not interact with the OS or other applications. Interactions to avoid include databases, sockets, even file systems. When you have a set of unit tests that run in isolation (and thus run very quickly), and when you have other sets of tests that do exercise all the interactions above, you are in a good position to quickly pinpoint who the culprit is when a test fails.

Friday, September 02, 2005

Recommended site: QA Podcast

I got an email from Darren Barefoot pointing me to a site he helped put together: QA Podcast. Very interesting stuff: interviews/conversations about software testing with folks who care and have something to say on this subject. I was glad to see that the podcasts published so far cover subjects such as performance testing and exploratory testing. I listened so far to a conversation on exploratory testing with James Bach and I already took away a ton of ideas I can apply in my testing activities.

Friday, August 19, 2005

Managing DNS zone files with dnspython

I've been using dnspython lately for transferring some DNS zone files from one name server to another. I found the package extremely useful, but poorly documented, so I decided to write this post as a mini-tutorial on using dnspython.

Running DNS queries

This is one of the things that's clearly spelled out on the Examples page. Here's how to run a DNS query to get the mail servers (MX records) for dnspython.org:

import dns.resolver

answers = dns.resolver.query('dnspython.org', 'MX')
for rdata in answers:
    print 'Host', rdata.exchange, 'has preference', rdata.preference

To run other types of queries, for example for IP addresses (A records) or name servers (NS records), replace MX with the desired record type (A, NS, etc.)

Reading a DNS zone from a file

In dnspython, a DNS zone is available as a Zone object. Assume you have the following DNS zone file called db.example.com:

$TTL 36000
example.com. IN SOA ns1.example.com. hostmaster.example.com. (
2005081201 ; serial
28800 ; refresh (8 hours)
1800 ; retry (30 mins)
2592000 ; expire (30 days)
86400 ) ; minimum (1 day)

example.com. 86400 NS ns1.example.com.
example.com. 86400 NS ns2.example.com.
example.com. 86400 MX 10 mail.example.com.
example.com. 86400 MX 20 mail2.example.com.
example.com. 86400 A 192.168.10.10
ns1.example.com. 86400 A 192.168.1.10
ns2.example.com. 86400 A 192.168.1.20
mail.example.com. 86400 A 192.168.2.10
mail2.example.com. 86400 A 192.168.2.20
www2.example.com. 86400 A 192.168.10.20
www.example.com. 86400 CNAME example.com.
ftp.example.com. 86400 CNAME example.com.
webmail.example.com. 86400 CNAME example.com.

To have dnspython read this file into a Zone object, you can use this code:

import dns.zone
from dns.exception import DNSException

domain = "example.com"
print "Getting zone object for domain", domain
zone_file = "db.%s" % domain

try:
    zone = dns.zone.from_file(zone_file, domain)
    print "Zone origin:", zone.origin
except DNSException, e:
    print e.__class__, e

A zone can be viewed as a dictionary mapping names to nodes; by default, dnspython uses name representations that are relative to the 'origin' of the zone. In our zone file, 'example.com' is the origin of the zone, and it gets the special name '@'. A name such as www.example.com is exposed by default as 'www'.

A name corresponds to a node, and a node contains a collection of record datasets, or rdatasets. A record dataset contains all the records of a given type. In our example, the '@' node corresponding to the zone origin contains 4 rdatasets, one for each record type that we have: SOA, NS, MX and A. The NS rdataset contains a set of rdatas, which are the individual records of type NS. The rdata class has subclasses for all the possible record types, and each subclass contains information specific to that record type.

Enough talking, here is some code that will hopefully make the previous discussion a bit clearer:

import dns.zone
from dns.exception import DNSException
from dns.rdataclass import *
from dns.rdatatype import *

domain = "example.com"
print "Getting zone object for domain", domain
zone_file = "db.%s" % domain

try:
    zone = dns.zone.from_file(zone_file, domain)
    print "Zone origin:", zone.origin
    for name, node in zone.nodes.items():
        rdatasets = node.rdatasets
        print "\n**** BEGIN NODE ****"
        print "node name:", name
        for rdataset in rdatasets:
            print "--- BEGIN RDATASET ---"
            print "rdataset string representation:", rdataset
            print "rdataset rdclass:", rdataset.rdclass
            print "rdataset rdtype:", rdataset.rdtype
            print "rdataset ttl:", rdataset.ttl
            print "rdataset has following rdata:"
            for rdata in rdataset:
                print "-- BEGIN RDATA --"
                print "rdata string representation:", rdata
                if rdataset.rdtype == SOA:
                    print "** SOA-specific rdata **"
                    print "expire:", rdata.expire
                    print "minimum:", rdata.minimum
                    print "mname:", rdata.mname
                    print "refresh:", rdata.refresh
                    print "retry:", rdata.retry
                    print "rname:", rdata.rname
                    print "serial:", rdata.serial
                if rdataset.rdtype == MX:
                    print "** MX-specific rdata **"
                    print "exchange:", rdata.exchange
                    print "preference:", rdata.preference
                if rdataset.rdtype == NS:
                    print "** NS-specific rdata **"
                    print "target:", rdata.target
                if rdataset.rdtype == CNAME:
                    print "** CNAME-specific rdata **"
                    print "target:", rdata.target
                if rdataset.rdtype == A:
                    print "** A-specific rdata **"
                    print "address:", rdata.address
except DNSException, e:
    print e.__class__, e

When run against db.example.com, the code above produces this output.

Modifying a DNS zone file

Let's see how to add, delete and change records in our example.com zone file. dnspython offers several different ways to get to a record if you know its name or its type.

Here's how to modify the SOA record and increase its serial number, a very common operation for anybody who maintains DNS zones. I use the iterate_rdatas method of the Zone class, which is handy in this case, since we know that the rdataset actually contains one rdata of type SOA:
   
for (name, ttl, rdata) in zone.iterate_rdatas(SOA):
    serial = rdata.serial
    new_serial = serial + 1
    print "Changing SOA serial from %d to %d" % (serial, new_serial)
    rdata.serial = new_serial


Here's how to delete a record by its name. I use the delete_node method of the Zone class:

node_delete = "www2"
print "Deleting node", node_delete
zone.delete_node(node_delete)

Here's how to change attributes of existing records. I use the find_rdataset method of the Zone class, which returns an rdataset containing the records I want to change. In the first section of the following code, I'm changing the IP address of 'mail', and in the second section I'm changing the TTL for all the NS records corresponding to the zone origin '@':

A_change = "mail"
new_IP = "192.168.2.100"
print "Changing A record for", A_change, "to", new_IP
rdataset = zone.find_rdataset(A_change, rdtype=A)
for rdata in rdataset:
    rdata.address = new_IP

rdataset = zone.find_rdataset("@", rdtype=NS)
new_ttl = rdataset.ttl / 2
print "Changing TTL for NS records to", new_ttl
rdataset.ttl = new_ttl

Here's how to add records to the zone file. The find_rdataset method can be used in this case too, with the create parameter set to True, in which case it creates a new rdataset if it doesn't already exist. Individual rdata objects are then created by instantiating their corresponding classes with the correct parameters -- such as rdata = dns.rdtypes.IN.A.A(IN, A, address="192.168.10.30").

I show here how to add records of type A, CNAME, NS and MX:

A_add = "www3"
print "Adding record of type A:", A_add
rdataset = zone.find_rdataset(A_add, rdtype=A, create=True)
rdata = dns.rdtypes.IN.A.A(IN, A, address="192.168.10.30")
rdataset.add(rdata, ttl=86400)

CNAME_add = "www3_alias"
target = dns.name.Name(("www3",))
print "Adding record of type CNAME:", CNAME_add
rdataset = zone.find_rdataset(CNAME_add, rdtype=CNAME, create=True)
rdata = dns.rdtypes.ANY.CNAME.CNAME(IN, CNAME, target)
rdataset.add(rdata, ttl=86400)

A_add = "ns3"
print "Adding record of type A:", A_add
rdataset = zone.find_rdataset(A_add, rdtype=A, create=True)
rdata = dns.rdtypes.IN.A.A(IN, A, address="192.168.1.30")
rdataset.add(rdata, ttl=86400)

NS_add = "@"
target = dns.name.Name(("ns3",))
print "Adding record of type NS:", NS_add
rdataset = zone.find_rdataset(NS_add, rdtype=NS, create=True)
rdata = dns.rdtypes.ANY.NS.NS(IN, NS, target)
rdataset.add(rdata, ttl=86400)

A_add = "mail3"
print "Adding record of type A:", A_add
rdataset = zone.find_rdataset(A_add, rdtype=A, create=True)
rdata = dns.rdtypes.IN.A.A(IN, A, address="192.168.2.30")
rdataset.add(rdata, ttl=86400)

MX_add = "@"
exchange = dns.name.Name(("mail3",))
preference = 30
print "Adding record of type MX:", MX_add
rdataset = zone.find_rdataset(MX_add, rdtype=MX, create=True)
rdata = dns.rdtypes.ANY.MX.MX(IN, MX, preference, exchange)
rdataset.add(rdata, ttl=86400)

Finally, after modifying the zone file via the zone object, it's time to write it back to disk. This is easily accomplished with dnspython via the to_file method. I chose to write the modified zone to a new file, so that I have my original zone available for other tests:

new_zone_file = "new.db.%s" % domain
print "Writing modified zone to file %s" % new_zone_file
zone.to_file(new_zone_file)

The new zone file looks something like this (note that all names have been relativized from the origin):

@ 36000 IN SOA ns1 hostmaster 2005081202 28800 1800 2592000 86400
@ 43200 IN NS ns1
@ 43200 IN NS ns2
@ 43200 IN NS ns3
@ 86400 IN MX 10 mail
@ 86400 IN MX 20 mail2
@ 86400 IN MX 30 mail3
@ 86400 IN A 192.168.10.10
ftp 86400 IN CNAME @
mail 86400 IN A 192.168.2.100
mail2 86400 IN A 192.168.2.20
mail3 86400 IN A 192.168.2.30
ns1 86400 IN A 192.168.1.10
ns2 86400 IN A 192.168.1.20
ns3 86400 IN A 192.168.1.30
webmail 86400 IN CNAME @
www 86400 IN CNAME @
www3 86400 IN A 192.168.10.30
www3_alias 86400 IN CNAME www3

Although it looks much different from the original db.example.com file, this file is also a valid DNS zone -- I tested it by having my DNS server load it.

Obtaining a DNS zone via a zone transfer

This is also easily done in dnspython via the from_xfr function of the zone module. Here's how to do a zone transfer for dnspython.org, trying all the name servers for that domain one by one:

import dns.resolver
import dns.query
import dns.zone
from dns.exception import DNSException
from dns.rdataclass import *
from dns.rdatatype import *

domain = "dnspython.org"
print "Getting NS records for", domain
answers = dns.resolver.query(domain, 'NS')
ns = []
for rdata in answers:
    n = str(rdata)
    print "Found name server:", n
    ns.append(n)

for n in ns:
    print "\nTrying a zone transfer for %s from name server %s" % (domain, n)
    try:
        zone = dns.zone.from_xfr(dns.query.xfr(n, domain))
    except DNSException, e:
        print e.__class__, e


Once we obtain the zone object, we can then manipulate it in exactly the same way as when we obtained it from a file.

Various ways to iterate through DNS records

Here are some other snippets of code that show how to iterate through records of different types assuming we retrieved a zone object from a file or via a zone transfer:

print "\nALL 'IN' RECORDS EXCEPT 'SOA' and 'TXT':"
for name, node in zone.nodes.items():
    rdatasets = node.rdatasets
    for rdataset in rdatasets:
        if rdataset.rdclass != IN or rdataset.rdtype in [SOA, TXT]:
            continue
        print name, rdataset

print "\nGET_RDATASET('A'):"
for name, node in zone.nodes.items():
    rdataset = node.get_rdataset(rdclass=IN, rdtype=A)
    if not rdataset:
        continue
    for rdata in rdataset:
        print name, rdata

print "\nITERATE_RDATAS('A'):"
for (name, ttl, rdata) in zone.iterate_rdatas('A'):
    print name, ttl, rdata

print "\nITERATE_RDATAS('MX'):"
for (name, ttl, rdata) in zone.iterate_rdatas('MX'):
    print name, ttl, rdata

print "\nITERATE_RDATAS('CNAME'):"
for (name, ttl, rdata) in zone.iterate_rdatas('CNAME'):
    print name, ttl, rdata

You can find the code referenced in this post in these 2 modules: zonemgmt.py and zone_transfer.py.

Monday, August 08, 2005

Agile documentation in the Django project

A while ago I wrote a post called "Agile documentation with doctest and epydoc". The main idea was to use unit tests as "executable documentation"; I showed in particular how combining doctest-based unit tests with a documentation system such as epydoc can result in up-to-date documentation that is synchronized with the code. This type of documentation not only shows the various modules, classes, methods, functions, and variables exposed by the code, but -- more importantly -- it also provides examples of how the code API gets used in "real life" via the unit tests.

I'm happy to see the Django team take a similar approach in their project. They announced on the project blog that API usage examples for Django models are available and are automatically generated from the doctest-based unit tests written for the model functionality. For example, a test module such as tests/testapp/models/basic.py gets automatically rendered into the 'Bare-bones model' API usage page. The basic.py file contains almost exclusively doctests in the form of a string called API_TESTS. The rest of the file contains some simple markers that are interpreted into HTML headers and such. Nothing fancy, but the result is striking.

I wish more projects would adopt this style of automatically generating documentation for their APIs from their unit test code. It can only help speed up their adoption. As an example, I wish the dnspython project had more examples of how to use the API it offers. That project does have epydoc-generated documentation, but if it also showed how the API actually gets used (via unit tests preferably), it would help its users avoid a lot of hair-pulling. Don't get me wrong, I think dnspython offers an incredibly useful API and I intend to post about some of my experiences using it, but it does require you to dig and sweat in order to uncover all its intricacies.

Anyway, kudos to the Django team for getting stuff right.

Monday, August 01, 2005

White-box vs. black-box testing

As I mentioned in my previous post, there's an ongoing discussion on the agile-testing mailing list on the merits of white-box vs. black-box testing. I had a lively exchange of opinions on this theme with Ron Jeffries. If you read my "Quick black-box testing example" post, you'll see the example of an application under test posted by Ron, as well as a list of black-box test activities and scenarios that I posted in reply. Ron questioned most of these black-box test scenarios, on the grounds that they provide little value to the overall testing process. In fact, I came away with the conclusion that Ron values black-box testing very little. He is of the opinion that white-box testing in the form of TDD is pretty much sufficient for the application to be rock-solid and as bug-free as any piece of software can hope to be.

I never had the chance to work on an agile team, so I can't really tell if Ron's assertion is true or not. But my personal opinion is that there is no way developers doing TDD can catch several classes of bugs that are outside of their code-only realm. I'm thinking most of all about the various quality criteria categories, also known as 'ilities', popularized by James Bach. Here are some of them: usability, installability, compatibility, supportability, maintainability, portability, localizability. All these are qualities that are very hard to test in a white-box way. They all involve interactions with the operating system, with the hardware, with the other applications running on the machine hosting the AUT. To this list I would add performance/stress/load testing, security testing, error recoverability testing. I don't see how you can properly test all these things if you don't do black-box testing in addition to white-box type testing.

In fact, there's an important psychological distinction between developers doing TDD and 'traditional' testers doing mostly black-box testing. A developer thinks "This is my code. It works fine. In fact, I'm really proud of it.", while a tester is more likely to think "This code has some really nasty bugs. Let me discover them before our customer does." These two approaches are complementary. You can't perform just one at the expense of the other, or else your overall code quality will suffer. You need to build code with pride before you try to break it in various devious ways.

Here's one more argument from Ron as to why white-box testing is more valuable than black-box testing:

To try to simplify: the search method in question has been augmented with an integer "hint" that is used to say where in the large table we should start our search. The idea is that by giving a hint, it might speed up the search, but the search must always work even if the hint is bad.

The question I was asking was how we would test the hinting aspect.

I expect questions to arise such as those Michael Bolton would suggest, including perhaps:

What if the hint is negative?
What if the hint is after the match?
What if the hint is bigger than the size of the table?
What if integers are actually made of cheese?
What if there are more records in the table than a 32-bit int?

Then, I propose to display the code, which will include, at the front, some lines like this:

if (hint < 1) hint = 0;
if (hint > table.size) hint = 0;

Then, I propose to point out that if we know that code is there, there are a couple of tests we can save. Therefore white box testing can help make testing more efficient, QED.

My counter-argument was this: what if you mistakenly build a new release of your software out of some old revision of the source code, a revision which doesn't contain the first 2 lines of the search method? Presumably the old version of the code was TDD-ed, but since the 2 lines weren't there, we didn't have unit tests for them either. So if you didn't have black-box tests exercising those values of the hint argument, you'd let an important bug escape out in the wild. I don't think it's that expensive to create automated tests that verify the behavior of the search method with various well-chosen values of the hint argument. Having such a test harness in place goes a long way in protecting against admittedly weird situations such as the 'old code revision' I described.

In fact, as Amir Kolsky remarked on the agile-testing list, TDD itself can be seen as black-box testing, since when we unit test some functionality, we usually test the behavior of that piece of code and not its implementation, thus we're not really doing white-box testing. To this, Ron Jeffries and Ilja Preuss replied that in TDD, you write the next test with an eye on the existing code. In fact, you write the next test so that the next piece of functionality for the existing code fails. Then you make it pass, and so on. So you're really looking at both the internal implementation of the code and at its interfaces, as exposed in your unit tests. At this point, it seems to me that we're splitting hairs. Maybe we should talk about developer/code producer testing vs. non-developer/code consumer testing. In fact, I just read this morning a very nice blog post from Jonathan Kohl on a similar topic: "Testing an application in layers". Jonathan talks about manual vs. automated testing (another hotly debated topic on the agile-testing mailing list), but many of the ideas in his post can be applied to the white-box vs. black-box discussion.

Thursday, July 28, 2005

Quick black box testing example

There's an ongoing debate on the agile-testing mailing list on whether it's better to have a 'black box' or a 'white box' view into the system under test. Some are of the opinion that black boxes are easier to test, while others (Ron Jeffries in particular) say that one would like to 'open up' one's boxes, especially in an agile environment. I suspect that the answer, as always, is somewhere in the middle -- both white-box and black-box testing are critical and valuable in their own right.

I think that it's in combining both types of tests that developers and testers will find the confidence that the software under test is stable and relatively free of bugs. Developers do white-box testing via unit tests, while testers do mostly black-box testing (or maybe gray-box, since they usually do have some insight into the inner workings of the application) via functional, integration and system testing. Let's not forget load/performance/stress testing too...They too can be viewed as white-box (mostly in the case of performance testing) vs. black-box (load/stress testing), as I wrote in a previous post.

I want to include in this post my answer to a little example posted by Ron Jeffries. Here's what he wrote:

Let's explore a simple example. Suppose we have an application that includes an interface (method) whose purpose is to find a "matching" record in a collection, if one exists. If none exists, the method is to return null.

The collection is large. Some users of this method have partial knowledge of the collection's order, so that they know that the record they want, if it is in there at all, occurs at or after some integer index in the collection.

So the method accepts a value, let's say a string /find/, to match the record on, and an integer /hint/, to be used as a hint to start the search. The first record in the table is numbered zero. The largest meaningful /hint/ value is therefore N-1, where N is the number of records in the table.

We want the search to always find a record if one exists, so that if /hint/ is wrong, but /find/ is in some record, we must still return a matching record, not null.

Now then. Assuming a black box, what questions do we want to ask, what tests do we want to write, against our method

public record search(string find, int hint)?

And here's my answer:

I'll take a quick stab at it. Here's what I'd start by doing (emphasis on start):

1. Generate various data sets to run the 'search' method against.

1a. Vary the number of items in the collection: create collections with 0, 1, 10, 100, 1000, 10000, 100000, 1 million items for starters; it may be the case that we hit an operating system limit at some point, for example if the items are files in the same directory (ever done an ls only to get back a message like "too many arguments"?)

1b. For each collection in 1a., generate several orderings: increasing order, decreasing order, random, maybe some other statistical distributions.

1c. Vary the length of the names of the items in the collection: create collections with 0, 1, 10, 100, 1000 items, where the names of the items are generated randomly with lengths between 1 and 1000 (arbitrary limit, which may change as we progress testing).

1d. Generate item names with 'weird' characters (especially /, \, :, ; -- since they tend to be used as separators by the OS).

1e. Generate item names that are Unicode strings.

2. Run (and time) the 'search' method against the various collections generated in 1. Make sure you cover cases such as:

2a. The item we search for is not in the collection: verify that the search method returns Null.

2b. The item we search for is in position p, where p can be 0, N/2, N-1, N.

2c. For each case in 2b, specify a hint of 0, p-1, p, p+1, N-1: verify that in all combinations of 2b and 2c, the search method returns the item in position p.

2d. Investigate the effect of item naming on the search. Does the search method work correctly when item names keep getting longer? When the item names contain 'weird' or Unicode characters?

2e. Graph the running time of the search method against collection size, when the item is or is not in the collection (so you generate 2 graphs). See if there is any anomaly.

2f. Run the tests in 2a-2d in a loop, to see if the search method produces a memory leak.

2g. Monitor various OS parameters (via top, vmstat, Windows PerfMon) to see how well-behaved the search functionality is with regard to the resources on that machine.

2h. See how the search method behaves when other resource-intensive processes are running on that machine (CPU-, disk-, memory-, network- intensive).

If the collection of records is kept in a database, then I can imagine a host of other stuff to test that is database-related. Same if the collection is retrieved over the network.

As I said, this is just an initial stab at testing the search method. I'm sure people can come up with many more things to test. But I think this provides a pretty solid base and a pretty good automated test suite for the AUT.
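
To make this a bit more concrete, here is a minimal sketch of how the position/hint matrix from 2a-2c could be automated. The search function below is purely hypothetical, a stand-in for the method under test, written only so the test is runnable:

# Minimal sketch of cases 2a-2c; 'search' is a hypothetical stand-in
# for the method under test.
def search(collection, find, hint):
    # Scan from 'hint' to the end, then wrap around to the start,
    # so a wrong hint still finds an existing record.
    n = len(collection)
    start = min(max(hint, 0), n)
    for i in list(range(start, n)) + list(range(0, start)):
        if collection[i] == find:
            return collection[i]
    return None

def test_search(n=100):
    collection = ["rec_%d" % i for i in range(n)]
    # 2b/2c: item at position p, with hints at and around p
    for p in [0, n // 2, n - 1]:
        target = "rec_%d" % p
        for hint in [0, max(p - 1, 0), p, min(p + 1, n - 1), n - 1]:
            assert search(collection, target, hint) == target
    # 2a: an absent item must return None, whatever the hint
    assert search(collection, "not_there", 0) is None
    assert search(collection, "not_there", n - 1) is None

test_search()
print("all search tests passed")

The same skeleton extends naturally to the data sets from 1a-1e: generate the collections once, then run the matrix above against each of them.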

I can think of many more tests that should be run if the search application talks to a database, or if it retrieves the search results via a Web service for example. I guess this all shows that a tester's life is not easy :-) -- but this is all exciting stuff at the same time!

Sunday, July 24, 2005

Django cheat sheet

Courtesy of James: Django cheat sheet. I went through the first 2 parts of the Django tutorial and I have to say I'm very impressed. Can't wait to give it a try on a real Web application.

Friday, July 22, 2005

Slides from 'py library overview' presentation

I presented an overview of the py library last night at our SoCal Piggies meeting. Although I didn't cover all the tools in the py library, I hope I managed to heighten the interest in this very useful collection of modules. You can find the slides here. Kudos again to Holger Krekel and Armin Rigo, the main guys behind the py lib.

And while we're on this subject, let's make py.test the official unit test framework for Django!!! (see the open ticket on this topic)

Friday, July 15, 2005

Installing Python 2.4.1 and cx_Oracle on AIX

I just went through the pain of getting the cx_Oracle module to work on an AIX 5.1 server running Oracle 9i, so I thought I'd jot down what I did, for future reference.

First of all, I had ORACLE_HOME set to /oracle/OraHome1.

1. Downloaded the rpm.rte package from the AIX Toolbox Download site.
2. Installed rpm.rte via smit.
3. Downloaded (from the same AIX Toolbox Download site) and installed the following RPM packages, in this order:
rpm -hi gcc-3.3.2-5.aix5.1.ppc.rpm
rpm -hi libgcc-3.3.2-5.aix5.1.ppc.rpm
rpm -hi libstdc++-3.3.2-5.aix5.1.ppc.rpm
rpm -hi libstdc++-devel-3.3.2-5.aix5.1.ppc.rpm
rpm -hi gcc-c++-3.3.2-5.aix5.1.ppc.rpm
4. Made a symlink from gcc to cc_r, since many configuration scripts expect cc_r as the compiler of choice on AIX, and I did not have it on my server.
ln -s /usr/bin/gcc /usr/bin/cc_r
5. Downloaded Python-2.4.1 from python.org.
6. Installed Python-2.4.1 (note that the vanilla ./configure failed, so I needed to run it with --disable-ipv6):
gunzip Python-2.4.1.tgz
tar xvf Python-2.4.1.tar
cd Python-2.4.1
./configure --disable-ipv6
make
make install
7. Downloaded cx_Oracle-4.1 from sourceforge.net.
8. Installed cx_Oracle-4.1 (note that I indicated the full path to python, since there was another older python version on that AIX server):
bash-2.05a# /usr/local/bin/python setup.py install
running install
running build
running build_ext
building 'cx_Oracle' extension
creating build
creating build/temp.aix-5.1-2.4
cc_r -pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I/oracle/OraHome1/rdbms/demo -I/oracle/OraHome1/rdbms/public -I/oracle/OraHome1/network/public -I/usr/local/include/python2.4 -c cx_Oracle.c -o build/temp.aix-5.1-2.4/cx_Oracle.o -DBUILD_TIME="July 15, 2005 14:49:28"
In file included from /oracle/OraHome1/rdbms/demo/oci.h:2138,
from cx_Oracle.c:9:
/oracle/OraHome1/rdbms/demo/oci1.h:148: warning: function declaration isn't a prototype
In file included from /oracle/OraHome1/rdbms/demo/ociap.h:190,
from /oracle/OraHome1/rdbms/demo/oci.h:2163,
from cx_Oracle.c:9:
/oracle/OraHome1/rdbms/public/nzt.h:667: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2655: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2664: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2674: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2683: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2692: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2701: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2709: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/public/nzt.h:2719: warning: function declaration isn't a prototype
In file included from /oracle/OraHome1/rdbms/demo/oci.h:2163,
from cx_Oracle.c:9:
/oracle/OraHome1/rdbms/demo/ociap.h:6888: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/demo/ociap.h:9790: warning: function declaration isn't a prototype
/oracle/OraHome1/rdbms/demo/ociap.h:9796: warning: function declaration isn't a prototype
In file included from Variable.c:93,
from Cursor.c:211,
from Connection.c:303,
from SessionPool.c:132,
from cx_Oracle.c:73:
DateTimeVar.c: In function `DateTimeVar_SetValue':
DateTimeVar.c:81: warning: unused variable `status'
creating build/lib.aix-5.1-2.4
/usr/local/lib/python2.4/config/ld_so_aix cc_r -pthread -bI:/usr/local/lib/python2.4/config/python.exp build/temp.aix-5.1-2.4/cx_Oracle.o -L/oracle/OraHome1/lib -lclntsh -o build/lib.aix-5.1-2.4/cx_Oracle.so -s
ld: 0711-317 ERROR: Undefined symbol: .OCINumberFromInt
ld: 0711-317 ERROR: Undefined symbol: .OCINumberFromReal
ld: 0711-317 ERROR: Undefined symbol: .OCINumberFromText
ld: 0711-317 ERROR: Undefined symbol: .OCINumberToReal
ld: 0711-317 ERROR: Undefined symbol: .OCINumberToText
ld: 0711-317 ERROR: Undefined symbol: .OCINumberToInt
ld: 0711-317 ERROR: Undefined symbol: .OCIParamGet
ld: 0711-317 ERROR: Undefined symbol: .OCIDescriptorFree
ld: 0711-317 ERROR: Undefined symbol: .OCIAttrGet
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtExecute
ld: 0711-317 ERROR: Undefined symbol: .OCISessionGet
ld: 0711-317 ERROR: Undefined symbol: .OCIServerDetach
ld: 0711-317 ERROR: Undefined symbol: .OCITransRollback
ld: 0711-317 ERROR: Undefined symbol: .OCISessionEnd
ld: 0711-317 ERROR: Undefined symbol: .OCISessionRelease
ld: 0711-317 ERROR: Undefined symbol: .OCIHandleFree
ld: 0711-317 ERROR: Undefined symbol: .OCIHandleAlloc
ld: 0711-317 ERROR: Undefined symbol: .OCIAttrSet
ld: 0711-317 ERROR: Undefined symbol: .OCITransStart
ld: 0711-317 ERROR: Undefined symbol: .OCISessionPoolCreate
ld: 0711-317 ERROR: Undefined symbol: .OCIErrorGet
ld: 0711-317 ERROR: Undefined symbol: .OCIEnvCreate
ld: 0711-317 ERROR: Undefined symbol: .OCINlsNumericInfoGet
ld: 0711-317 ERROR: Undefined symbol: .OCISessionPoolDestroy
ld: 0711-317 ERROR: Undefined symbol: .OCITransCommit
ld: 0711-317 ERROR: Undefined symbol: .OCITransPrepare
ld: 0711-317 ERROR: Undefined symbol: .OCIBreak
ld: 0711-317 ERROR: Undefined symbol: .OCIUserCallbackRegister
ld: 0711-317 ERROR: Undefined symbol: .OCIUserCallbackGet
ld: 0711-317 ERROR: Undefined symbol: .OCIServerAttach
ld: 0711-317 ERROR: Undefined symbol: .OCISessionBegin
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtRelease
ld: 0711-317 ERROR: Undefined symbol: .OCIDescriptorAlloc
ld: 0711-317 ERROR: Undefined symbol: .OCIDateTimeConstruct
ld: 0711-317 ERROR: Undefined symbol: .OCIDateTimeCheck
ld: 0711-317 ERROR: Undefined symbol: .OCIDateTimeGetDate
ld: 0711-317 ERROR: Undefined symbol: .OCIDateTimeGetTime
ld: 0711-317 ERROR: Undefined symbol: .OCILobGetLength
ld: 0711-317 ERROR: Undefined symbol: .OCILobWrite
ld: 0711-317 ERROR: Undefined symbol: .OCILobTrim
ld: 0711-317 ERROR: Undefined symbol: .OCILobRead
ld: 0711-317 ERROR: Undefined symbol: .OCILobFreeTemporary
ld: 0711-317 ERROR: Undefined symbol: .OCILobCreateTemporary
ld: 0711-317 ERROR: Undefined symbol: .OCIDefineByPos
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtGetBindInfo
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtPrepare2
ld: 0711-317 ERROR: Undefined symbol: .OCIStmtFetch
ld: 0711-317 ERROR: Undefined symbol: .OCIBindByName
ld: 0711-317 ERROR: Undefined symbol: .OCIBindByPos
ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.
collect2: ld returned 8 exit status
running install_lib
At this point, I did a lot of Google searches to find out why the loader emits these errors. I finally found the solution: Oracle 9i installs the 64-bit libraries in $ORACLE_HOME/lib and the 32-bit libraries in $ORACLE_HOME/lib32. Since setup.py is looking by default in $ORACLE_HOME/lib (via -L/oracle/OraHome1/lib), it finds the 64-bit libraries and it fails with the above errors. The quick hack I found was to manually re-run the last command that failed and specify -L/oracle/OraHome1/lib32 instead of -L/oracle/OraHome1/lib (I think the same effect can be achieved via environment variables such as LIBPATH).
bash-2.05a# /usr/local/lib/python2.4/config/ld_so_aix cc_r -pthread -bI:/usr/local/lib/python2.4/config/python.exp build/temp.aix-5.1-2.4/cx_Oracle.o -L/oracle/OraHome1/lib32 -lclntsh -o build/lib.aix-5.1-2.4/cx_Oracle.so -s
Then I re-ran setup.py in order to copy the shared library to the Python site-packages directory:

bash-2.05a# /usr/local/bin/python setup.py install
running install
running build
running build_ext
running install_lib
copying build/lib.aix-5.1-2.4/cx_Oracle.so -> /usr/local/lib/python2.4/site-packages


At this point I was able to import cx_Oracle at the Python prompt:

bash-2.05a# /usr/local/bin/python
Python 2.4.1 (#1, Jul 15 2005, 14:44:07)
[GCC 3.3.2] on aix5
Type "help", "copyright", "credits" or "license" for more information.
>>> import cx_Oracle
>>> dir(cx_Oracle)
['BINARY', 'BLOB', 'CLOB', 'CURSOR', 'Connection', 'Cursor', 'DATETIME', 'DataError', 'DatabaseError', 'Date', 'DateFromTicks', 'Error', 'FIXED_CHAR', 'FNCODE_BINDBYNAME', 'FNCODE_BINDBYPOS', 'FNCODE_DEFINEBYPOS', 'FNCODE_STMTEXECUTE', 'FNCODE_STMTFETCH', 'FNCODE_STMTPREPARE', 'IntegrityError', 'InterfaceError', 'InternalError', 'LOB', 'LONG_BINARY', 'LONG_STRING', 'NUMBER', 'NotSupportedError', 'OperationalError', 'ProgrammingError', 'ROWID', 'STRING', 'SYSDBA', 'SYSOPER', 'SessionPool', 'TIMESTAMP', 'Time', 'TimeFromTicks', 'Timestamp', 'TimestampFromTicks', 'UCBTYPE_ENTRY', 'UCBTYPE_EXIT', 'UCBTYPE_REPLACE', 'Warning', '__doc__', '__file__', '__name__', 'apilevel', 'buildtime', 'connect', 'makedsn', 'paramstyle', 'threadsafety', 'version']
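
As a footnote to the lib vs. lib32 issue: here is a small, hypothetical Python helper (not part of the cx_Oracle build, just a sketch assuming the Oracle 9i layout described above) that picks the library directory matching the word size of the running Python build:

# Hypothetical helper: a 64-bit Python build needs $ORACLE_HOME/lib,
# a 32-bit build needs $ORACLE_HOME/lib32 (Oracle 9i layout).
import os
import struct

def oracle_lib_dir():
    oracle_home = os.environ["ORACLE_HOME"]  # e.g. /oracle/OraHome1
    bits = struct.calcsize("P") * 8  # pointer size of this Python build
    if bits == 64:
        return os.path.join(oracle_home, "lib")
    return os.path.join(oracle_home, "lib32")

print(oracle_lib_dir())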

Thursday, July 14, 2005

py lib gems: greenlets and py.xml

I've been experimenting with various tools in the py library lately, in preparation for a presentation I'll give to the SoCal Piggies group meeting this month. The py lib is chock-full of gems that are waiting to be discovered. In this post, I'll talk a little about greenlets, the creation of Armin Rigo. I'll also briefly mention py.xml.

Greenlets implement coroutines in Python. Coroutines can be seen as a generalization of generators, and it looks like the standard Python library will support them in the future via 'enhanced generators' (see PEP 342). Coroutines allow you to exit a function by 'yielding' a value and switching to another function. The original function can then be re-entered, and it will continue execution from exactly where it left off.
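
To get a feel for the switching behavior, here is a minimal sketch, assuming the same py.magic.greenlet import that Armin's code below uses:

from py.magic import greenlet

def ping():
    print("in ping")
    gr2.switch()           # jump into pong()
    print("back in ping")  # resumes here when pong() switches back

def pong():
    print("in pong")
    gr1.switch()  # resume ping() right where it left off

gr1 = greenlet(ping)
gr2 = greenlet(pong)
gr1.switch()  # prints: in ping, in pong, back in ping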

The greenlet documentation offers some really eye-opening examples, such as using greenlets to implement generators. Another typical use case for greenlets/coroutines is turning asynchronous or event-based code into normal sequential control flow code -- the Python Desktop Server project has a good example of exactly such a transformation.

I've also been reading and looking at the code from Armin's EuroPython talk on greenlets. The talk itself must have been highly entertaining, since it is presented as a PyGame-based game. In one of the code examples I downloaded, I noticed yet another application of the asynchronous-to-sequential transformation, this time related to parsing XML data. In a few lines of code, Armin showed how to turn an asynchronous, Expat-based parsing mechanism into a generator that yields the XML elements one at a time. This approach combines the advantages of 1) using a stream oriented parser (and thus being able to process large amounts of XML data via handlers) with 2) using a generator to expose the XML parsing code in the shape of an iterator.

Here is Armin's code which I saved in a module called iterxml.py (I made a few minor modifications to make the code more general-purpose):

from py.magic import greenlet
import xml.parsers.expat

def send(arg):
    # hand the parsed data over to the parent (consumer) greenlet
    greenlet.getcurrent().parent.switch(arg)

# 3 handler functions
def start_element(name, attrs):
    send(('START', name, attrs))

def end_element(name):
    send(('END', name))

def char_data(data):
    data = data.strip()
    if data:
        send(('DATA', data))

def greenparse(xmldata):
    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = start_element
    p.EndElementHandler = end_element
    p.CharacterDataHandler = char_data
    p.Parse(xmldata, 1)

def iterxml(xmldata):
    g = greenlet(greenparse)
    data = g.switch(xmldata)
    while data is not None:
        yield data
        data = g.switch()

Consumers of this code can pass a string containing an XML document to the iterxml function and then use a for loop to iterate through the elements yielded by the function, like this:

for data in iterxml(xmldata):
    print data

When iterxml first executes, it instantiates a greenlet object and associates it with the greenparse function. Then it 'switches' into the greenlet and thus calls that function with the given xmldata argument. There is nothing out of the ordinary in the greenparse function, which simply assigns the 3 handler functions to the expat parser object, then calls its Parse method. However, the 3 handler functions all use greenlets via the send function, which sends the parsed data to the parent of the current greenlet. The parent in this case is the greenlet running iterxml, which yields the data at that point, then switches back into the greenparse function. The handler functions then get called again whenever a new XML element is encountered, and the switching back and forth continues until there is no more data to be parsed.
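
For instance, feeding a tiny document through iterxml produces events roughly like this (a sketch based on the code above; the exact string types in the printed tuples depend on the Python/expat versions):

from iterxml import iterxml

xmldata = "<root><item>hi</item></root>"
for data in iterxml(xmldata):
    print(data)
# Expected events, in order:
#   ('START', 'root', {})
#   ('START', 'item', {})
#   ('DATA', 'hi')
#   ('END', 'item')
#   ('END', 'root')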

I've wanted for a while to check out the REST API offered by upcoming.org (which is a free alternative to meetup.com), so I used it in conjunction with the XML parsing stuff via greenlets.

Here's some code that uses the iterxml module to parse the response returned by upcoming.org when a request for searching the events in the L.A. metro area is sent to their server:

import sys, urllib
from iterxml import iterxml
import py

baseurl = "http://www.upcoming.org/services/rest/"
api_key = "YOUR_API_KEY_HERE"
metro_id = "1" # L.A. metro area
log = py.log.Producer("")

def get_venue_info(venue_id):
    method = "venue.getInfo"
    request = "%s?api_key=%s&method=%s&venue_id=%s" % (
        baseurl, api_key, method, venue_id)
    response = urllib.urlopen(request).read()
    venue_info = None
    for data in iterxml(response):
        if data[0] == 'START' and data[1] == 'venue':
            attr = data[2]
            venue_info = "%(name)s in %(city)s" % attr
            break
    return venue_info

def search_events(keywords):
    method = "event.search"
    request = "%s?api_key=%s&method=%s&metro_id=%s&search_text=%s" % (
        baseurl, api_key, method, metro_id, keywords)
    response = urllib.urlopen(request).read()
    for data in iterxml(response):
        if data[0] == 'START' and data[1] == 'event':
            attr = data[2]
            log("\n" + "-" * 80)
            log.EVENT("%(name)s" % attr)
            log.WHAT("%(description)s" % attr)
            log.WHERE(get_venue_info(attr['venue_id']))
            log.WHEN("%(start_date)s @ %(start_time)s" % attr)

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print "Usage: %s <keywords>" % sys.argv[0]
        sys.exit(1)

    # URL-encode the spaces between the search keywords
    keywords = "%20".join(sys.argv[1:])
    search_events(keywords)

An upcoming.org API key is automatically generated for you when you request one on the upcoming.org site.

I tested the script by searching for Python-related events in L.A.:

./upcoming_search.py python

--------------------------------------------------------------------------------
[EVENT] SoCal Piggies July Meeting
[WHAT] Monthly meeting of the Southern California Python Interest Group.
[WHERE] USC in Los Angeles
[WHEN] 2005-07-26 @ 19:00:00

Note that I'm also using the py.log facilities I mentioned in a previous post. The only thing I needed to do was to instantiate a log object via log = py.log.Producer("") and then use it via keywords such as EVENT, WHAT, WHERE and WHEN. Since I didn't declare any log consumer, the default consumer is used, which prints its messages to stdout. Each message string is nicely prefixed by the corresponding keyword.
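
For completeness, here is a small sketch of wiring up a consumer explicitly, assuming the setconsumer/STDOUT API from the py lib's logging docs:

import py

log = py.log.Producer("upcoming")
# Route messages produced under the "upcoming" keyword explicitly to
# stdout (the same behavior you get from the default consumer).
py.log.setconsumer("upcoming", py.log.STDOUT)
log.EVENT("SoCal Piggies July Meeting")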

I'm still experimenting with greenlets, and I'm sure I'll use them in the future especially for event-based GUI code.

I want to also briefly touch on py.xml, a tool that allows you to generate XML and HTML documents almost painlessly from your Python code.

Here's the XML returned by the event.search method of upcoming.org when called with a search text of 'python':

<rsp stat="ok" version="1.0">
<event id="24868" name="SoCal Piggies July Meeting" description="Monthly meeting of the Southern California Python Interest Group." start_date="2005-07-26" end_date="0000-00-00" start_time="19:00:00" end_time="21:00:00" personal="0" selfpromotion="0" metro_id="1" venue_id="7425" user_id="14959" category_id="4" date_posted="2005-07-13" />
</rsp>


And here's how to generate the same XML output with py.xml:

import py

class ns(py.xml.Namespace):
    "my custom xml namespace"

doc = ns.rsp(
    ns.event(
        id="24868",
        name="SoCal Piggies July Meeting",
        description="Monthly meeting of the Southern California Python Interest Group.",
        start_date="2005-07-26",
        end_date="0000-00-00",
        start_time="19:00:00",
        end_time="21:00:00",
        personal="0",
        selfpromotion="0",
        metro_id="1",
        venue_id="7425",
        user_id="14959",
        category_id="4",
        date_posted="2005-07-13"),
    stat="OK",
    version="1.0",
)

print doc.unicode(indent=2).encode('utf8')


The code above prints:

<rsp stat="OK" version="1.0">
<event category_id="4" date_posted="2005-07-13" description="Monthly meeting of the Southern California Python Interest Group." end_date="0000-00-00" end_time="21:00:00" id="24868" metro_id="1" name="SoCal Piggies July Meeting" personal="0" selfpromotion="0" start_date="2005-07-26" start_time="19:00:00" user_id="14959" venue_id="7425"/></rsp>


As the py.xml documentation succinctly puts it, positional arguments are child-tags and keyword-arguments are attributes. Indentation is also available, via the indent argument to the unicode method.
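
The same idea extends to HTML: the py lib ships a predefined py.xml.html namespace, so a sketch like the following (the element names are just standard HTML tags) also works:

import py

html = py.xml.html  # predefined HTML namespace shipped with the py lib
doc = html.html(
    html.body(
        html.h1("Upcoming events"),
        html.ul(
            html.li("SoCal Piggies July Meeting"),
        ),
    ),
)
print(doc.unicode(indent=2))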

I intend to cover other tools from the py library in future posts. Stay tuned for discussions on py.execnet, little-known aspects of py.test and more!

Friday, July 01, 2005

Recommended reading: Jason Huggins's blog

I recently stumbled on Jason's blog via the Thoughtworks RSS feed aggregator. Jason is the creator of Selenium and a true Pythonista. His latest post on using CherryPy, SQLObject and Cheetah for creating a 'Ruby on Rails'-like application is very interesting and entertaining. Highly recommended! Hopefully the Subway guys will heed Jason's advice of focusing more on "ease of installation and fancy earth-shatteringly beautiful 10 minute setup movies" -- this is one area in which it's hard to beat the RoR guys, but let's at least try it!
