Journal of Humanities and Education Development (JHED)
ISSN: 2581-8651
Vol-7, Issue-2, Mar-Apr 2025
https://dx.doi.org/10.22161/jhed.7.2.10
Peer-Reviewed Journal
Construction and Standardization of Psychology Aptitude
Test for Incoming College Psychology Students
Keisha Charisse O. Digon¹, Arjay Y. Alvarado²
¹Mental Health Unit, Corazon Locsin Montelibano Memorial Regional Hospital, Philippines
Email: [email protected]
²College of Arts and Sciences, Carlos Hilado Memorial State University, Philippines
Email: [email protected]
Received: 29 Feb 2025; Received in revised form: 27 Mar 2025; Accepted: 04 Apr 2025
©2025 The Author(s). Published by TheShillonga. This is an open-access article under the CC BY license
(https://creativecommons.org/licenses/by/4.0/)
Abstract— Psychology programs in higher education institutions rely on accurate assessment tools to
gauge the aptitude of incoming students effectively. However, existing standardized tests often fail to
address the unique skill sets and knowledge domains specific to psychology, emphasizing the necessity for
a tailored Psychology Aptitude Test. This study aims to bridge this gap by developing and standardizing a
Psychology Aptitude Test tailored for incoming college psychology students. Anchored on Item Response
Theory (IRT), the study endeavors to create a test that comprehensively evaluates students' preparedness
for the BS Psychology program. Objectives include assessing the Psychology Aptitude Test's validity, item
difficulty and discrimination indices, and reliability. Findings indicate a psychometrically sound instrument with strong
content validity, a balanced spread of item difficulty, and high internal consistency. Recommendations for ongoing item
review, expanded validation studies, and predictive validity assessment are provided to optimize the test's
effectiveness in evaluating psychological aptitude within college environments.
Keywords — Psychology Aptitude Test, Item Response Theory, psychometrics, validity and reliability,
college readiness
I. INTRODUCTION
Psychology, as an academic discipline and profession, plays an important role in understanding human behavior, cognition, and emotions. With the increasing popularity of psychology programs in colleges and universities, a compelling need arises to ensure the effectiveness of assessment tools for incoming students. While standardized tests exist for various academic fields, constructing an aptitude test specifically tailored for psychology students remains an unmet challenge (Cohen & Swerdlik, 2018). Existing standardized tests often lack specificity to the unique skill sets and knowledge domains required in psychology, potentially leading to inaccurate assessments of students' aptitude and preparedness (Bridgeman & Wendler, 2016). This gap highlights the necessity for a Psychology Aptitude Test to comprehensively evaluate incoming college psychology students' readiness and suitability for the demands of their chosen field.
Research in the field of psychological assessment has primarily focused on the development and validation of tests for specific constructs such as intelligence, personality, and clinical disorders (Graham, 2016). While these assessments provide valuable insights into individual differences, they do not fully capture the breadth of skills and competencies relevant to success in psychology education and practice. Furthermore, the existing literature on aptitude testing in psychology predominantly revolves around postgraduate or professional levels, overlooking the critical transition phase of incoming college students (Kuncel et al., 2013). Consequently, there is a notable gap in the literature regarding the construction and standardization of a Psychology Aptitude Test explicitly tailored for incoming college psychology students. Addressing this gap is essential for enhancing the accuracy and validity of student assessments and ultimately promoting the quality of psychology education.
Developing and implementing a Psychology Aptitude Test holds significant benefits for multiple stakeholders within the academic and professional domains. The test results may be utilized to make informed decisions regarding the selection of incoming psychology students, ensuring that admitted individuals possess the necessary foundational knowledge and skills for success in their academic endeavors. Additionally, students themselves may benefit from a more accurate assessment of their aptitude, guiding their academic and career aspirations and facilitating their personal and professional development in the field of psychology. Overall, the construction and standardization of a Psychology Aptitude Test has the potential to enhance the quality and rigor of psychology education, benefiting students, educators, and the broader psychological community alike.
Objectives of the Study
1. What is the validity of the Psychology Aptitude Test?
2. What is the difficulty index of each item of the Psychology Aptitude Test?
3. What is the discrimination index of each item of the Psychology Aptitude Test?
4. What is the reliability of the Psychology Aptitude Test?
5. What is the test norm of the Psychology Aptitude Test?
Framework
This study is anchored on the Item Response Theory (IRT) proposed by Lord and Novick (1968), which provides a strong framework for constructing and standardizing the Psychology Aptitude Test for incoming college psychology students. IRT models the relationship between an individual's ability and the probability of responding correctly to a test item, providing a principled approach for creating and evaluating test items. By applying IRT principles, the researchers can ensure that the items included in the Psychology Aptitude Test accurately measure students' psychology-related abilities across various difficulty levels. This ensures that the test provides reliable and valid assessments, crucial for evaluating students' aptitude and readiness for psychology education at the college level.
In the context of this study, IRT guides the item development process, allowing for the creation of test items that effectively discriminate between students with different levels of aptitude in psychology. Furthermore, IRT provides guidance for the calibration of test items, ensuring that they accurately measure students' abilities while maintaining consistency and reliability across different test administrations. By anchoring the study on IRT, the researchers aim to develop a Psychology Aptitude Test that provides precise and informative measurements of incoming college psychology students' aptitude for the field.
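To make the IRT framing concrete, the short sketch below evaluates a two-parameter logistic (2PL) item characteristic curve, i.e., the probability of a correct response as a function of ability. The study does not report which IRT model was fitted, so the 2PL form, the parameter names a (discrimination) and b (difficulty), and the example values are illustrative assumptions rather than the authors' specification.

```python
import numpy as np

def icc_2pl(theta, a, b):
    """Two-parameter logistic item characteristic curve:
    probability of a correct response given ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Illustrative items: an easy, weakly discriminating item versus a
# harder, highly discriminating item (parameter values are hypothetical).
abilities = np.linspace(-3, 3, 7)
print(np.round(icc_2pl(abilities, a=0.8, b=-1.0), 2))
print(np.round(icc_2pl(abilities, a=1.8, b=0.5), 2))
```

Under this framing, an item with a steeper curve separates examinees of nearby ability levels more sharply, which is the property the discrimination analysis in the Results section targets.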
Scope and Limitations
The scope of this study encompassed the construction and standardization of a Psychology Aptitude Test for incoming college psychology students, including item development, validation procedures, pilot testing, and psychometric analysis. This aimed to provide a comprehensive assessment tool that accurately measures students' knowledge, skills, and abilities in Psychology.
II. MATERIALS AND METHODS
Research Design
This study employed an instrumentation design. Instrumentation design, particularly in the context of developing an aptitude test, involves creating a set of questions and tasks that accurately assess the knowledge, skills, and abilities of individuals in a specific field. This process includes designing questions that cover a wide range of topics within the subject area, varying in difficulty to effectively evaluate the examinees' proficiency. In this study, the aptitude test was designed to assess incoming students' understanding of foundational psychological concepts and related areas (libguides.mit.edu, n.d.). Furthermore, the design of an aptitude test should ensure that the test is valid, reliable, and free of bias. Validity ensures that the test measures what it intends to measure, while reliability ensures consistent results when administered multiple times. Additionally, the design of an aptitude test should be based on a conceptual framework that aligns with the learning objectives and outcomes of the test.
Research Procedure
This research was undertaken to develop an aptitude test intended to serve as a reliable instrument, systematically designed to assess various facets of students' cognitive abilities, analytical reasoning, and psychological knowledge. In doing so, the researchers followed this procedure:
Firstly, the researchers initiated the study by conducting a comprehensive review of relevant literature. Secondly, they developed test items based on the specifications outlined in the Introduction to Psychology course, a fundamental component of the BS Psychology program. Thirdly, the validation of these items was undertaken by five subject matter experts. Following the validation process, a request to carry out the study was submitted to the office of the Vice-President for Academic Affairs. Upon approval, the researchers administered the instrument to three hundred sixty-eight (368) applicants for the
BS Psychology program at Carlos Hilado Memorial State University. Subsequently, the data collected from the pilot test were compiled and analyzed using appropriate statistical methods.
Statistical Analysis
For problem number one, which aims to determine the validity of the Psychology Aptitude Test, the Content Validity Ratio (CVR) developed by Lawshe (1975) was used.
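As a concrete companion to the Lawshe procedure named above, the sketch below computes the content validity ratio from expert ratings. The five-validator panel and the .99 retention threshold come from the study; the function name and the example counts of "essential" ratings are illustrative assumptions.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: (n_e - N/2) / (N/2), where n_e is the number of
    panelists rating the item 'essential' and N is the panel size."""
    half = n_panelists / 2
    return (n_essential - half) / half

# With five validators, an item rated essential by all five reaches
# CVR = 1.0; Lawshe's table requires roughly .99 for retention at N = 5.
print(content_validity_ratio(5, 5))  # 1.0 -> retained
print(content_validity_ratio(4, 5))  # 0.6 -> dropped
```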
For problem number two, which aims to determine the difficulty index of each item of the Psychology Aptitude Test, Crocker's (1986) method was followed.
For problem number three, which aims to determine the discrimination index of each item of the Psychology Aptitude Test, Ebel's (1979) method was followed.
For problem number four, which aims to determine the reliability of the Psychology Aptitude Test, KR20 was used.
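The difficulty and discrimination indices referred to in problems two and three are classical item statistics: the proportion of examinees answering an item correctly, and the difference in that proportion between the top 27% and bottom 27% of scorers, the split described later in the Results. The sketch below is a minimal illustration; the array names and the randomly generated scores are assumptions, not the study's data.

```python
import numpy as np

def item_statistics(responses):
    """responses: 2-D array (examinees x items) of 0/1 scores.
    Returns per-item difficulty (proportion correct) and
    discrimination (upper-27% minus lower-27% proportion correct)."""
    totals = responses.sum(axis=1)               # total score per examinee
    order = np.argsort(totals)                   # examinees from lowest to highest
    k = max(1, int(round(0.27 * len(totals))))   # size of each extreme group
    lower, upper = responses[order[:k]], responses[order[-k:]]
    difficulty = responses.mean(axis=0)
    discrimination = upper.mean(axis=0) - lower.mean(axis=0)
    return difficulty, discrimination

# Tiny illustrative data set (8 examinees x 3 items).
rng = np.random.default_rng(0)
scores = rng.integers(0, 2, size=(8, 3))
p, d = item_statistics(scores)
print(np.round(p, 2), np.round(d, 2))
```

A negative discrimination value means low scorers outperformed high scorers on that item, which is why such items are flagged for discarding in Table 1.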
III. RESULTS AND DISCUSSION
Statement of the Problem No. 1
In the first phase, the researchers developed a 70-item test, which was subjected to content validation. Five subject matter experts who are psychology teachers validated the test by following Lawshe's method (1975). After the validation process, only sixty (60) items yielded a content validity ratio of at least .99 (the minimum CVR required for retention with five validators).
Statement of the Problem No. 2 and 3
Table 1 provides a detailed breakdown of how individual items performed in the psychology aptitude test, specifically looking at their difficulty and ability to differentiate between high- and low-scoring students. The distribution of difficulty levels was well-balanced, with most items categorized as moderately difficult, aligning with previous studies (Candiasa et al., 2018; Raza et al., 2022; Gupta, 2010; Propp, 2005; Lloyd, 1991). However, consistent with similar research, some items required revision because of weak discrimination despite moderate difficulty. In contrast, others were removed entirely because they failed to effectively distinguish between students, particularly those with negative discrimination indices. This pattern suggests that the test was designed to cover a range of difficulty levels, ensuring a well-rounded evaluation of students' aptitude.
Looking at the discrimination index, the results varied. Certain items demonstrated strong discrimination, echoing findings from Lim et al. (2015). For instance, while categorized as very difficult, items like 32, 36, and 55 were still able to differentiate between students with higher and lower aptitudes effectively. This suggests that even the most challenging items in the test play an important role in identifying differences in ability. However, items flagged for revision or removal indicate areas where the test could be improved to maintain its accuracy and effectiveness in assessing student aptitude.
To enhance the overall reliability and precision of the test, only the items with the highest discrimination indices (those classified as "Very Good") were retained. These high-performing items were particularly effective at distinguishing between students with different levels of aptitude and thus contributed significantly to the test's accuracy. As a result, the original 60-item test was reduced to 30 carefully selected items.
The discrimination index was calculated by comparing how often students in the top 27% (higher scorers) and bottom 27% (lower scorers) answered each item correctly. This method helps ensure that the test accurately identifies variations in students' abilities, making it a reliable tool for assessing psychological aptitude.
Table 1. Difficulty and Discrimination Indices
Item No. | Difficulty Index | Discrimination Index
1 | 0.49 Moderate | 0.26 To be revised
2 | 0.75 Easy | -0.08 To be discarded
3 | 0.26 Difficult | 0.38 Very Good Item
4 | 0.43 Moderate | 0.29 To be revised
5 | 0.15 Difficult | 0.51 Very Good Item
6 | 0.68 Moderate | 0.08 To be discarded
7 | 0.63 Moderate | 0.14 To be discarded
8 | 0.80 Easy | -0.03 To be discarded
9 | 0.38 Moderate | 0.40 Very Good Item
10 | 0.82 Easy | -0.09 To be discarded
11 | 0.75 Easy | -0.03 To be discarded
12 | 0.43 Moderate | 0.29 To be revised
13 | 0.86 Very Easy | -0.18 To be discarded
14 | 0.30 Moderate | 0.41 Very Good Item
15 | 0.15 Difficult | 0.55 Very Good Item
16 | 0.69 Moderate | 0.04 To be discarded
17 | 0.58 Moderate | 0.19 To be revised
18 | 0.76 Easy | 0.03 To be discarded
19 | 0.48 Moderate | 0.19 To be revised
20 | 0.32 Moderate | 0.41 Very Good Item
21 | 0.16 Difficult | 0.49 Very Good Item
22 | 0.86 Very Easy | -0.14 To be discarded
23 | 0.19 Difficult | 0.48 Very Good Item
24 | 0.90 Very Easy | -0.23 To be revised
25 | 0.80 Easy | -0.08 To be discarded
26 | 0.21 Difficult | 0.50 Very Good Item
27 | 0.17 Difficult | 0.49 Very Good Item
28 | 0.65 Moderate | 0.12 To be discarded
29 | 0.77 Easy | -0.07 To be discarded
30 | 0.47 Moderate | 0.24 To be revised
31 | 0.85 Easy | -0.15 To be discarded
32 | 0.07 Very Difficult | 0.57 Very Good Item
33 | 0.64 Moderate | 0.13 To be discarded
34 | 0.32 Moderate | 0.38 Very Good Item
35 | 0.60 Moderate | 0.22 To be revised
36 | 0.11 Very Difficult | 0.59 Very Good Item
37 | 0.16 Difficult | 0.54 Very Good Item
38 | 0.71 Easy | -0.02 To be discarded
39 | 0.40 Moderate | 0.32 Very Good Item
40 | 0.64 Moderate | 0.18 To be revised
41 | 0.27 Difficult | 0.44 Very Good Item
42 | 0.21 Difficult | 0.49 Very Good Item
43 | 0.49 Moderate | 0.23 To be revised
44 | 0.61 Moderate | 0.14 To be discarded
45 | 0.34 Moderate | 0.35 Very Good Item
46 | 0.43 Moderate | 0.20 To be revised
47 | 0.31 Moderate | 0.45 Very Good Item
48 | 0.44 Moderate | 0.28 To be revised
49 | 0.32 Moderate | 0.40 Very Good Item
50 | 0.40 Moderate | 0.39 Very Good Item
51 | 0.54 Moderate | 0.19 To be revised
52 | 0.30 Moderate | 0.41 Very Good Item
53 | 0.29 Difficult | 0.30 Very Good Item
54 | 0.24 Difficult | 0.44 Very Good Item
55 | 0.12 Very Difficult | 0.47 Very Good Item
56 | 0.33 Moderate | 0.32 Very Good Item
57 | 0.32 Moderate | 0.36 Very Good Item
58 | 0.32 Moderate | 0.43 Very Good Item
59 | 0.24 Difficult | 0.47 Very Good Item
60 | 0.23 Difficult | 0.45 Very Good Item
Statement of the Problem No. 4
Table 2 provides key metrics for the Psychology Aptitude Test for incoming psychology students, showing that the test is both reliable and well-structured. With 363 students completing the test and a full 100% response rate, the data
is complete and representative of the group. One of the test's strongest indicators is its internal consistency, reflected in a high Cronbach's Alpha (KR-20) score of 0.892. Since a reliability score above 0.70 is generally considered acceptable (Nunnally & Bernstein, 1994), this result suggests that the test consistently measures what it is designed to assess. A strong internal consistency like this ensures that students' scores are not random but instead reflect their actual aptitude for psychology (Anastasi & Urbina, 1997).
The test consists of 30 carefully selected items, allowing for a thorough evaluation of students' psychological aptitude while avoiding unnecessary redundancy. Research on test development emphasizes that balancing the number of items and their quality leads to more accurate assessments and prevents test fatigue (Thorndike & Thorndike-Christ, 2010).
Table 2. Reliability of Psychology Aptitude Test
Scale: Psychology Aptitude Test | Cronbach's Alpha (KR-20): .892 | N of Items: 30
Cases | n | %
Valid | 363 | 100.0
Excluded | 0 | 0.0
Total | 363 | 100.0
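The KR-20 coefficient reported in Table 2 (.892) follows the standard Kuder-Richardson Formula 20 for dichotomously scored items. The sketch below shows the computation; the simulated response matrix and all function and variable names are illustrative assumptions, not the study's data set.

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomous (0/1) items.
    responses: 2-D array (examinees x items)."""
    k = responses.shape[1]
    p = responses.mean(axis=0)                      # proportion correct per item
    q = 1 - p
    total_var = responses.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

# Hypothetical 0/1 score matrix for 200 examinees on a 30-item test.
rng = np.random.default_rng(1)
ability = rng.normal(size=200)
difficulty = rng.normal(size=30)
prob_correct = 1 / (1 + np.exp(-(ability[:, None] - difficulty)))
scores = (rng.random((200, 30)) < prob_correct).astype(int)
print(round(kr20(scores), 3))
```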
Statement of the Problem No. 5
The Psychology Aptitude Test for incoming first-year college students was developed after refining the test through a careful selection process. Initially, it contained 60 items, but after analyzing how well each question measured what it was supposed to (by looking at difficulty levels, discrimination power, and overall reliability), the test was reduced to 30 stronger items. This refined version was then given to a new group of 387 students to establish a standard way of interpreting scores. The results were organized using the Stanine scale, a standard system in education that divides test scores into nine levels, making it easier to compare individual results to the larger group.
The stanine system classifies students into three categories: above average, average, and below average. As indicated in Table 3, those who scored 19 and above were placed in the above-average range, meaning they likely have strong psychological aptitude and critical thinking skills that could help them excel in psychology-related subjects. Most students, who scored between 13 and 18, fell within the average range, indicating they have the foundational skills necessary for success but may perform at varying levels. Meanwhile, students who scored 12 and below were categorized as below average, suggesting they may need extra support in psychology-related coursework.
Establishing a test norm like this is important because it provides a straightforward way to interpret results based on a larger sample rather than looking at scores in isolation. Psychologists and education experts like Anastasi and Urbina (1997) have emphasized how norming makes test scores more meaningful by showing how an individual's performance compares to others. The stanine system, which is commonly used in standardized assessments, helps simplify score interpretation while maintaining accuracy (Thorndike & Thorndike-Christ, 2010). Since the test was reduced to 30 items through detailed analysis, only the most reliable and relevant questions were kept. This process follows established research on test development, highlighting how refining questions improves accuracy (Nunnally & Bernstein, 1994). Research also supports the idea that stanine scores can help predict academic performance, making them helpful in identifying students who might struggle or excel (De Ayala, 2009).
Table 3. Psychology Aptitude Test Norm for Incoming College Students
Raw Score | Stanine | Interpretation
23 and above | 9 | Above Average
21-22 | 8 | Above Average
19-20 | 7 | Above Average
17-18 | 6 | Average
15-16 | 5 | Average
13-14 | 4 | Average
11-12 | 3 | Below Average
9-10 | 2 | Below Average
8 and below | 1 | Below Average
n = 387
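The norm in Table 3 can be applied directly when scoring new applicants. The sketch below maps a raw score on the 30-item test to its stanine and interpretation using the cut-offs reported above; the function and constant names are ours and are not part of the published instrument.

```python
# Raw-score cut-offs taken from Table 3 (n = 387 norm group).
STANINE_NORM = [
    (23, 9, "Above Average"), (21, 8, "Above Average"), (19, 7, "Above Average"),
    (17, 6, "Average"), (15, 5, "Average"), (13, 4, "Average"),
    (11, 3, "Below Average"), (9, 2, "Below Average"), (0, 1, "Below Average"),
]

def to_stanine(raw_score):
    """Return (stanine, interpretation) for a raw score on the 30-item test."""
    for floor, stanine, label in STANINE_NORM:
        if raw_score >= floor:
            return stanine, label
    return 1, "Below Average"

print(to_stanine(24))  # (9, 'Above Average')
print(to_stanine(14))  # (4, 'Average')
print(to_stanine(7))   # (1, 'Below Average')
```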
IV. CONCLUSIONS AND RECOMMENDATIONS
In summary, the development and validation process of the Psychology Aptitude Test has yielded positive outcomes, indicating its potential as a valuable tool for evaluating psychological aptitude in incoming college students. By meticulously validating content and selecting items, the test now consists of 30 items with strong content validity, ensuring they accurately measure the intended constructs. Analysis of item performance revealed a balanced range of difficulty levels, and careful prioritization of items with notable discrimination indices emphasizes the test's accuracy in distinguishing between individuals with varying levels of aptitude. Moreover, the high Cronbach's Alpha coefficient affirms the test's internal consistency, reinforcing its reliability in consistently assessing psychological constructs. Going forward, continuous evaluation, diverse validation studies, and refined discrimination analysis will enhance the test's efficacy and suitability for assessing psychological aptitude in college settings.
Recommendations
Following a comprehensive analysis of the development and validation of the Psychology Aptitude Test, practical recommendations are presented here to enhance its effectiveness in evaluating psychological aptitude among incoming college students. Based on the study's findings, these recommendations aim to strengthen the test's utility and relevance in the college setting.
Continuous Item Review and Revision. Regularly review and update test items to ensure relevance and effectiveness in assessing psychological aptitude.
Expanded Validation Studies. Conduct additional studies with diverse samples to confirm the test's effectiveness across different populations and contexts.
Predictive Validity Assessment. Conduct a predictive validity assessment to examine the extent to which scores on the Psychology Aptitude Test predict future academic success and retention in the BS Psychology program.
Longitudinal Studies. Undertake studies tracking students' academic and career outcomes over time to assess the test's predictive validity.
REFERENCES
[1] Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Prentice Hall.
[2] Bridgeman, B., & Wendler, C. (2016). Fairness in college admissions testing: Validity and validity issues. In C. R. Reynolds, K. J. Vannest, & E. Fletcher-Janzen (Eds.), Encyclopedia of special education: A reference for the education of children, adolescents, and adults with disabilities and other exceptional individuals (pp. 841-844). Wiley.
[3] Candiasa, I. M., Natajaya, N., & Widiartini, K. (2018). Vocational aptitude test. SHS Web of Conferences, 42, 00044. http://dx.doi.org/10.1051/shsconf/20184200044
[4] Cohen, R. J., & Swerdlik, M. E. (2018). Psychological testing and assessment: An introduction to tests and measurement. McGraw-Hill Education.
[5] Crocker, L. (1986). Methods for determining the difficulty index of test items. Journal of Educational Measurement, 23(2), 121–132.
[6] De Ayala, R. J. (2009). The theory and practice of item response theory. Guilford Press.
[7] Ebel, R. L. (1979). Techniques for determining the discrimination index of test items. Educational and Psychological Measurement, 39(4), 647–655.
[8] Graham, J. R. (2016). MMPI-2: Assessing personality and psychopathology (5th ed.). Oxford University Press.
[9] Gupta, H. (2010). Graduate pharmacy aptitude test. Journal of Pharmacy and Bioallied Sciences, 2(1), 55. http://dx.doi.org/10.4103/0975-7406.62715
[10] Kuncel, N. R., Hezlett, S. A., & Ones, D. S. (2013). Academic performance, career potential, creativity, and job performance: Can one construct predict them all? Journal of Personality and Social Psychology, 86(1), 148-161.
[11] Lawshe, C. H. (1975). A quantitative approach to content validity. Personnel Psychology, 28(4), 563–575.
[12] Lim, H. A., et al. (2015). The validity of the Modern Language Aptitude Test (MLAT) in a bilingual context: A case study of Singapore. Language Testing, 32(2), 193-218. https://doi.org/10.1177/0265532214531319
[13] Lloyd, B. (1991). Test your own aptitude. Long Range Planning, 24(2), 122. http://dx.doi.org/10.1016/0024-6301(91)90130-g
[14] Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). McGraw-Hill.
[15] Propp, J. (2005). Self-referential aptitude test. Math Horizons, 12(3), 35. http://dx.doi.org/10.1080/10724117.2005.12021810
[16] Raza, M. A., Deeba, F., & Faqir, R. (2022). A comparative analysis of school teachers' teaching aptitude. Global Educational Studies Review, 7(3), 45–52. http://dx.doi.org/10.31703/gesr.2022(vii-iii).05
[17] Thorndike, R. M., & Thorndike-Christ, T. (2010). Measurement and evaluation in psychology and education (8th ed.). Pearson.
Appendix
Sample Test Items
This section presents selected test items from the Psychology Aptitude Test for Incoming College Psychology Students. These
items assess fundamental psychological concepts and critical thinking skills relevant to incoming psychology students.
Item 9
Your friend has a hard time understanding classical conditioning. As a psychology student, how can you explain this concept to
your friend?
a. Whenever you bring a baseball bat home, you take your child to the park to play. As a result, every time your youngster sees
you bring home a baseball bat, he becomes delighted because he associates your baseball bat with a trip to the park
b. Your parents gave you a reward every time you got a high score on an exam
c. Every time you clean your room, your mother rewards you with 1000 pesos
d. All of the above
Item 42
Which kind of validity do psychometricians examine when determining the status of the criterion in relation to a test’s accuracy?
a. Predictive Validity
b. Content Validity
c. Concurrent Validity
d. Construct Validity
Item 60
What is the probability of rolling two dice and getting a sum of 7?
a. 1/6
b. 1/12
c. 1/36
d. 6/36