Calls of the Wild: Exploring Procedural Abstraction in App Inventor
Abstract: One of the most important computational concepts in any programming language is procedural abstraction. We investigate the use of procedures in MIT App Inventor, a web-based blocks programming environment for creating Android mobile apps. We explore how procedures are used “in the wild” by examining two datasets of App Inventor projects: all projects of ten thousand randomly chosen users and all projects of all prolific users (those users with 20 or more projects).

Our data analysis indicates that procedural abstraction is a concept that is learned over time by some App Inventor users, but it is used relatively infrequently, and features like parameters and returning values are used even more rarely. Procedures are most frequently called only once, indicating that they are often used to organize code rather than to reuse it. Surprisingly, 10% of declared procedures are never called, suggesting that this situation should be flagged by the environment.

I. INTRODUCTION

Blocks programming languages are a popular way to lower barriers to programming for those with little or no programming experience as well as for casual programmers and even seasoned programmers using unfamiliar domain-specific languages [1]. For example, MIT App Inventor is a web-based blocks programming environment for democratizing the creation of apps for Android mobile devices [2], used by over 5 million people to create over 20 million app projects (see http://appinventor.mit.edu/ai2stats).

The open-ended nature of blocks environments like App Inventor and Scratch (in which many projects are personally meaningful creations as opposed to more constrained programs specified as part of coordinated activities like courses) makes it challenging to investigate what users are learning when they create a sequence of projects. The long-term goal of our research is to use learning analytics on large datasets of projects to identify common conceptual difficulties experienced by App Inventor programmers and to improve the App Inventor programming environment and associated educational support materials to alleviate these difficulties [3].

One of the most important computational concepts in almost every programming language is procedural abstraction: the notion that a computational pattern can be captured in a (possibly parameterized) declaration (such as a procedure, function, or method) and can then be used by calling the declared entity on actual argument values that are used to fill the “holes” specified by the parameters. Procedural abstraction has numerous aspects. From a software engineering perspective, it allows code to be decomposed into reusable parts that can be written, tested, and debugged independently but called multiple times. Indeed, one App Inventor textbook [4] introduces procedures in the context of the Don’t Repeat Yourself (DRY) mantra, a software engineering principle popularized in [5]. From a cognitive perspective, procedures break programs into smaller chunks that are easier to think about, so even procedures that are called only once can help to make programs more understandable. Procedures also establish an abstraction barrier that separates the high-level behavior of the procedures from their low-level implementation details, permitting clients to use procedures based on their contracts without knowledge of their implementations, allowing implementers to improve all calls by changing a single declaration, and supporting the notion of data abstraction [6].

In this paper, we study how App Inventor programmers use procedures by using a learning analytics approach based on two large datasets of App Inventor projects “in the wild”: all projects of ten thousand randomly chosen users (whom we call “random users”) and all projects of users with 20 or more projects (whom we call “prolific users”).

This work makes several contributions to the study of App Inventor in particular and blocks programming in general:
1) We give insight into how App Inventor programmers use procedures, a key computational concept essential for understanding the skill level of users and how they learn over time.
2) We identify issues with procedure use that can be addressed by improvements in the App Inventor environment and educational materials.
3) The methodology we develop for studying procedures can be used to study other key computational concepts in App Inventor and other blocks languages.

II. APP INVENTOR PROJECTS AND PROCEDURES

We start with a concrete example: a MyMoleMash project that is a variant of a MoleMash tutorial program commonly used as an example of a simple game and a first program illustrating procedures (e.g., [4, Ch. 3]).

An App Inventor program is called a project. It can have multiple screens, each of which is specified independently. Our example (like most App Inventor projects) has a single screen. A screen consists of components that are chosen in a drag-and-drop Designer window (Figure 1). Components
the MoleTimer.Timer and MoleSprite.Touched handlers. DimensionRandom is a procedure with two parameters (canvasDimension and moleDimension) that returns a random number between 0 and (canvasDimension − moleDimension); this is called twice in MoveMole to find random X and Y coords for MoleSprite. App Inventor

guished by name (e.g., Pascal’s nonfruitful “procedures” vs. fruitful “functions”) or by return type (e.g., Java’s nonfruitful void methods vs. non-void fruitful methods). In many languages, such as JavaScript, Python, Scheme, Standard ML, etc., a single term (such as “function” or “procedure”) is used for both kinds of declarations, and nonfruitful declarations are distinguished from fruitful ones by either omitting an explicit return statement or returning a special or uninteresting “don’t care” value (e.g., Python’s None value or Standard ML’s unit value).
Fig. 2. The Blocks Editor for MyMoleMash. The program has three event handlers (MoleTimer.Timer, MoleSprite.Touched, and
ResetButton.Clicked), the first two of which call the nonfruitful parameterless procedure MoveMole. The body of the MoveMole procedure declaration
contains two calls to the fruitful DimensionRandom procedure, which has two parameters.
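
As a rough textual analogue of the blocks in Fig. 2, the following Python sketch mirrors the structure described above: two event handlers that each call the nonfruitful, parameterless MoveMole, which in turn calls the fruitful, two-parameter DimensionRandom twice. It is only an illustration, not the App Inventor program itself. The `Sprite` class, the canvas size, the mole size, and the choice of random integers are assumptions made so the sketch can run on its own; the ResetButton.Clicked handler is omitted because its body is not described in the text.

```python
import random


class Sprite:
    """Hypothetical stand-in for an App Inventor sprite (size and position only)."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.x = 0
        self.y = 0


# Assumed dimensions, chosen only for illustration.
CANVAS_WIDTH, CANVAS_HEIGHT = 300, 300
mole_sprite = Sprite(width=40, height=40)


def dimension_random(canvas_dimension, mole_dimension):
    """Fruitful procedure: returns a random number between 0 and
    (canvas_dimension - mole_dimension). Integers are an assumption."""
    return random.randint(0, canvas_dimension - mole_dimension)


def move_mole():
    """Nonfruitful, parameterless procedure: moves the mole to a random
    position that keeps it fully inside the canvas."""
    mole_sprite.x = dimension_random(CANVAS_WIDTH, mole_sprite.width)
    mole_sprite.y = dimension_random(CANVAS_HEIGHT, mole_sprite.height)


def on_mole_timer():
    """Analogue of the MoleTimer.Timer event handler."""
    move_mole()


def on_mole_touched():
    """Analogue of the MoleSprite.Touched event handler."""
    move_mole()
```

Here MoveMole is declared once and called from two handlers, and DimensionRandom is declared once and called twice, which is the kind of reuse that procedural abstraction is meant to support.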
Fig. 4. The number of active blocks per project for both datasets.
Fig. 5. Proportion of projects that contain n procedure declarations (n > 0). This chart excludes the 85% of random projects and 82% of prolific projects that contain no procedures.
Fig. 6. Proportion of users having n projects with at least one called procedure.
Fig. 7. Project number when users first use procedures.
V. ANALYSES INVOLVING PROCEDURES

A. Procedure Declarations

Procedures aren’t commonly used in App Inventor: 26,428 (85%) of random user projects and 1,267,643 (82%) of prolific user projects do not contain any procedure declarations. Fig. 5 shows the distribution of procedure declarations in the remaining projects. The percentage of projects with n procedure declarations decreases in a manner that is roughly 1/n.

User statistics involving procedure declarations differ greatly between the two datasets. Only 1,749 (17.5%) of random users have some project in which a procedure is declared, but 39,873 (86.1%) of prolific users have such a project; this validates our expectation that the two datasets would differ in this regard. The large differences between the (minimum, median) numbers of projects for random users (0, 1) and for prolific users (20, 27) mean that prolific users have many more opportunities to use procedures. Fig. 6 shows a breakdown of the proportion of users in each dataset that have n projects with at least one called procedure. As n increases, this proportion drops fast for random users, but for prolific users it is flat as n ranges from 1 to 4, after which it drops in a linear fashion.

Another analysis that distinguishes the datasets can be seen in Fig. 7. This shows, for all users who eventually call a procedure in at least one of their projects, the project number (ordered by creation time) in which they first use a procedure. For the random users, 65% use a procedure early (within their first four projects), while this number is only 19% for prolific users. This suggests that there is a subset of the random users who come to App Inventor with prior programming experience, and so start using procedures in their projects right away. The much lower numbers for the prolific users suggest that either they come to App Inventor without knowing procedures (as might be the case for students taking a first programming course using App Inventor), or, even if they do have prior programming experience, they start their App Inventor experience with simple projects of the sort used in introductory tutorials (which do not include procedures). In a typical App Inventor course, a project similar to MoleMash (which uses zero-parameter nonfruitful procedures) would be introduced after simple apps, perhaps explaining the jump between projects 5 and 20 for prolific users.
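
The "first use" statistic behind Fig. 7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project summarization program used for the paper: the input format (a mapping from a user to a list of per-project counts of called procedures, already ordered by creation time), the function names, and the toy data are all hypothetical.

```python
def first_procedure_project(called_counts):
    """Return the 1-based number of the first project (in creation order)
    with at least one called procedure, or None if there is none."""
    for number, count in enumerate(called_counts, start=1):
        if count > 0:
            return number
    return None


def early_adopter_share(users, early_cutoff=4):
    """Among users who eventually call a procedure, return the fraction
    whose first use falls within their first `early_cutoff` projects."""
    first_uses = [first_procedure_project(counts) for counts in users.values()]
    first_uses = [n for n in first_uses if n is not None]
    if not first_uses:
        return 0.0
    early = sum(1 for n in first_uses if n <= early_cutoff)
    return early / len(first_uses)


# Toy data (hypothetical): number of called procedures in each project.
users = {
    "u1": [0, 2, 0],           # first use in project 2
    "u2": [0, 0, 0, 0, 0, 1],  # first use in project 6
    "u3": [0, 0],              # never uses a procedure, so excluded
    "u4": [1],                 # first use in project 1
}
share = early_adopter_share(users)
```

Users who never call a procedure are excluded from the denominator, matching the population described for Fig. 7 (users who eventually call a procedure in at least one project).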
(a) Distribution of users by the proportion of their projects with at least one called procedure. This excludes users for whom the ratio is exactly 0.
(b) Distribution of users by the ratio of (1) the total number of called procedures they declared (in all projects) to (2) their total number of projects. This excludes users for whom the ratio is exactly 0. The rightmost whisker is the 95th percentile. The max random user ratio is 109 and the max prolific user ratio is 39.27 (omitted here).
Fig. 8. The user population for (a) and (b) is all users with at least one project in which there is at least one called procedure (1,522 random users and 39,226 prolific users).
Fig. 9. Proportion of declared procedures that are called a given number of times.
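
The quantity plotted in Fig. 9, the proportion of declared procedures that are called a given number of times, could be computed along the following lines. This is a sketch under assumptions: the inputs (a list of declared procedure names and a list with one callee name per call block), the function names, and the example data are hypothetical, and procedure names are assumed to be unique within the code being analyzed. The `UpdateScore` and `Unused` procedures are invented for the example.

```python
from collections import Counter


def call_count_distribution(declared, calls):
    """Map each observed call count to the proportion of declared
    procedures that are called exactly that many times."""
    declared = list(declared)
    if not declared:
        return {}
    calls_per_procedure = Counter(calls)
    histogram = Counter(calls_per_procedure[name] for name in declared)
    return {n: histogram[n] / len(declared) for n in sorted(histogram)}


def uncalled_procedures(declared, calls):
    """Return the declared procedures that are never called."""
    called = set(calls)
    return [name for name in declared if name not in called]


# Hypothetical example: four declarations and five call blocks.
declared = ["MoveMole", "DimensionRandom", "UpdateScore", "Unused"]
calls = ["MoveMole", "MoveMole", "DimensionRandom", "DimensionRandom", "UpdateScore"]

distribution = call_count_distribution(declared, calls)
never_called = uncalled_procedures(declared, calls)
```

The zero bucket of the distribution corresponds to declared procedures that are never called, and `uncalled_procedures` names them.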
TABLE I
FREQUENCY OF PARAMETERS IN PROCEDURE DECLARATIONS
levels off at about a quarter of the projects is worrisome. It suggests that, after creating a substantial number of projects, users either aren’t making projects complicated enough to require procedures, or they’re failing to use procedures when they should be using them.

We have also made similar charts for skill progression involving (1) procedures with a nonzero number of parameters that are called at least once and (2) fruitful procedures that are called at least once. Space does not permit the inclusion of these charts, but they are roughly similar in shape to those in Fig. 11 except for the final level approached (about 10% for procedures with a nonzero number of parameters and 4 to 6% for fruitful procedures). These results bolster the conclusion that these two concepts are not learned well by App Inventor users.

VI. DISCUSSION

A. Threats to Validity

Results about computational concepts from project datasets will be most meaningful when the projects are original, i.e., built from scratch by users based on their own ideas and current programming skills. However, it is likely that many projects in our datasets are unoriginal, i.e., they are created by following online tutorials, doing exercises in a class, or trying out or making minor modifications to existing projects shared by others (e.g., via App Inventor’s gallery feature).

A previous study of App Inventor estimated that at least 16.4% of projects were based on tutorials, as determined by project names [14]. It used a small tutorial set and didn’t consider non-English versions of the project names, so this is most likely an underestimate. As part of this paper, we determined that 22% of procedure names in the random user dataset matched one of the procedure names used in tutorials at appinventor.mit.edu and appinventor.org, suggesting that many procedures in our datasets may be unoriginal, affecting our results.

Another source of unoriginality is that some users have many similar versions of a project, most likely created as checkpoints. App Inventor does not provide a mechanism for saving and restoring an earlier project version other than making a copy, nor did it have an undo capability until recently. So saving many versions of a large program is a strategy to avoid losing work. For example, we discovered that in Fig. 5, the bump at 8 procedures in the random user dataset was almost entirely due to one user who had 33 nearly identical versions of a project with 8 procedure declarations.

To understand what App Inventor users are learning and what misconceptions they have, we need to filter out unoriginal projects and focus on original ones. Our plan for doing this is sketched in [3]. It extends our previous work on determining project similarity by defining distance metrics between App Inventor projects represented as feature vectors [16].

B. Improving App Inventor

Our analyses so far suggest several ways to improve App Inventor with regard to procedures:
• Uncalled procedure declarations are surprisingly common, so they should be highlighted in some way. App Inventor currently puts various errors and warnings on problematic blocks, but uncalled procedure declarations currently carry no warning. They should, perhaps along with an easy way to create an associated call block.
• To highlight that procedure declarations need to have associated caller blocks, there should be additional ways to create caller blocks for a procedure other than opening the procedure drawer. For example, hovering over a procedure declaration with a mouse could open a menu option for creating a caller block, similar to the way hovering over a variable declaration gives a menu for creating getter and setter blocks for that variable [17].
• The procedure drawer could contain examples of declarations with at least one parameter. This would make it more obvious that App Inventor procedures can have parameters for those not familiar with using the gear icon to edit the parameters of a procedure.
• When a user is copying a large block of code from one event handler or procedure to another, the App Inventor system might suggest that the user encapsulate that code into a procedure declaration, and could even automatically generate a candidate declaration.
• Software engineering approaches for automatically detecting opportunities for creating procedures to avoid code duplication [18], [19] could be adapted to App Inventor programs to allow an option for automatically refactoring the blocks on a screen by introducing (possibly parameterized) procedures and replacing the duplicated code by calls to these procedures.
• App Inventor would benefit from more tutorials involving procedures (especially fruitful procedures and procedures with parameters), as well as a help system that provides documentation/examples for blocks in context, e.g., how to declare and use procedure parameters and how to solve the plumbing problem for fruitful procedure bodies.

C. Classifying Users

The App Inventor environment does not collect demographic data about users, nor does it “know” the role in which people are using it. Some users come to App Inventor with significant prior programming experience, while many are programming newbies. Some users are students taking a semester-long class and will be engaged with App Inventor for months; others are casual programmers working on their own projects; yet others are just trying App Inventor out, perhaps in the context of an activity like Code.org’s Hour of Code.

A user’s skill level with procedures and other computational concepts can help to classify them. Other user data, such as their number of projects, the period in which they’re engaged with App Inventor, and the overlap of the creation times of their projects with the projects of others, can help to identify them as members of a coordinated activity, like a course or club [3]. Some prolific App Inventor users are teachers, which can often be deduced from the fact that they appear to have created numerous large projects around the same time when they upload their students’ projects to grade them.
Automatically classifying users in terms of their role and expertise level could be used to customize the kinds of suggestions, examples, documentation, etc. that are offered to them in a more interactive version of App Inventor. Research in intelligent tutoring systems has long used user modeling to suggest activities for students based on their skill level [20]. More recent work has indicated that enhancing these models with student-specific parameters that take into account the speed of learning improves the predictive power of the models [21]. Furthermore, data-driven learning of student parameters can also reduce the need for embedding significant domain knowledge [22].

VII. CONCLUSION AND FUTURE WORK

Our preliminary exploratory data analysis of procedures in App Inventor projects indicates that procedures are a concept that is learned over time, but they are used relatively infrequently, and features like parameters and returning values are used even more rarely. Procedures are most frequently called only once, indicating that they are often used to organize code rather than to reuse it. Surprisingly, 10% of declared procedures are never called, indicating conceptual confusions and suggesting that this situation should be flagged by the environment.

With regard to procedures, a next step is to use a feature-vector representation of projects (1) to filter out unoriginal procedures and repeat the analysis from this paper to see how this affects the results and (2) to approximate missed opportunities for proceduralization in a project.

We also plan to study other App Inventor features that support abstraction, such as lists, loops, and generic blocks. Preliminary investigations indicate these are also used rarely and have associated misconceptions. We hypothesize that the concrete nature of blocks may encourage a kind of “abstractionless programming” in which abstraction mechanisms will be rarely used unless they are somehow taught explicitly, possibly by interventions from the programming environment.

The similarity between our results and those in the exploratory analysis of Scratch [8] (e.g., few projects with declared procedures, most procedures are parameterless, most common number of times a procedure is called is one) suggests further in-depth investigations involving multiple blocks languages.

Finally, we imagine using the skill level exhibited with computational concepts like procedures to classify users and enable customized user feedback from the programming environment.

ACKNOWLEDGMENTS

This work was supported by the Wellesley College Science Summer Research Program and an IBM Faculty Research Fund for Science and Math. The App Inventor datasets were provided by the MIT team’s Jeff Schiller. Our analyses use a Python project summarization program that builds upon earlier work by Benji Xie and Maja Svanberg. Maja’s work was supported by Wellesley College Faculty Grants and by the National Science Foundation under grant DUE-1226216.

REFERENCES

[1] D. Bau, J. Gray, C. Kelleher, J. S. Sheldon, and F. Turbak, “Learnable programming: Blocks and beyond,” Communications of the ACM, 2017, to appear.
[2] D. Wolber, H. Abelson, and M. Friedman, “Democratizing computing with App Inventor,” GetMobile: Mobile Computing and Communications, vol. 18, no. 4, pp. 53–58, Jan. 2015.
[3] F. Turbak, E. Mustafaraj, M. Svanberg, and M. Dawson, “Work in progress: Identifying and analyzing original projects in an open-ended blocks programming environment,” in Proceedings of the 23rd International DMS Conference on Visual Languages and Sentient Systems (DMSVLSS 2017).
[4] D. Wolber, H. Abelson, E. Spertus, and L. Looney, App Inventor 2: Create your own Android Apps, 2nd ed. O’Reilly Media, Inc., 2014.
[5] A. Hunt and D. Thomas, The Pragmatic Programmer: From Journeyman to Master. Addison-Wesley, 2000.
[6] H. Abelson, G. J. Sussman, and J. Sussman, Structure and Interpretation of Computer Programs (2nd ed.). MIT Press, 1996.
[7] F. Turbak, M. Sherman, F. Martin, D. Wolber, and S. C. Pokress, “Events-first programming in App Inventor,” Journal of Computing Sciences in Colleges, Apr. 2014.
[8] E. Aivaloglou and F. Hermans, “How kids code and how we know: An exploratory study on the Scratch repository,” in Proceedings of the 2016 ACM Conference on International Computing Education Research (ICER ’16), 2016, pp. 53–61.
[9] C. Scaffidi and C. Chambers, “Skill progression demonstrated by users in the Scratch animation environment,” International Journal of Human-Computer Interaction, vol. 28, pp. 383–398, 2012.
[10] J. N. Matias, S. Dasgupta, and B. M. Hill, “Skill progression in Scratch revisited,” in Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI ’16), 2016, pp. 1486–1490.
[11] S. Yang, C. Domeniconi, M. Revelle, M. Sweeney, B. U. Gelman, C. Beckley, and A. Johri, “Uncovering trajectories of informal learning in large online communities of creators,” in Proceedings of the Second (2015) ACM Conference on Learning @ Scale (L@S ’15), 2015, pp. 131–140.
[12] K. Brennan and M. Resnick, “New frameworks for studying and assessing the development of computational thinking,” in Annual Meeting of the American Educational Research Association, Vancouver, CA, 2012.
[13] B. Xie and H. Abelson, “Skill progression in MIT App Inventor,” in IEEE Symposium on Visual Languages and Human-Centric Computing, 2016, pp. 213–217.
[14] B. Xie, I. Shabir, and H. Abelson, “Measuring the usability and capability of App Inventor to create mobile applications,” in 3rd International Workshop on Programming for Mobile and Touch, 2015, pp. 1–8.
[15] J. Okerlund and F. Turbak, “A preliminary analysis of App Inventor blocks programs (showpiece/poster),” in IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC ’13), Sep. 2013, abstract available at http://cs.wellesley.edu/~tinkerblocks/VLHCC13-abstract.pdf.
[16] E. Mustafaraj, F. A. Turbak, and M. Svanberg, “Identifying original projects in App Inventor,” in Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2017, pp. 567–573.
[17] F. Turbak, D. Wolber, and P. Medlock-Walton, “The design of naming features in App Inventor 2,” in IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC ’14), Aug. 2014.
[18] R. Komondoor and S. Horwitz, “Eliminating duplication in source code via procedure extraction,” Dept. of Computer Sciences, University of Wisconsin-Madison, Tech. Rep. 1461, 2002.
[19] T. J. Edler von Koch, B. Franke, P. Bhandarkar, and A. Dasgupta, “Exploiting function similarity for code size reduction,” in Proceedings of the 2014 SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (LCTES ’14), 2014, pp. 85–94.
[20] A. T. Corbett and J. R. Anderson, “Knowledge tracing: Modeling the acquisition of procedural knowledge,” User Modeling and User-Adapted Interaction, vol. 4, no. 4, pp. 253–278, 1994.
[21] M. V. Yudelson, K. R. Koedinger, and G. J. Gordon, “Individualized Bayesian knowledge tracing models,” in International Conference on Artificial Intelligence in Education. Springer, 2013, pp. 171–180.
[22] S. J. Lee, Y.-E. Liu, and Z. Popovic, “Learning individual behavior in an educational game: a data-driven approach,” in Educational Data Mining 2014, 2014.