Intelligent Highway Adaptive Lane Learning System in Multiple ROIs of Surveillance Camera Video
automatic Multiple ROI is very limited, such as in Lee et al. [13], which used automated Multiple ROIs for the specific task of rain variation. The bulk of research is single ROI with manual ROI annotation, exemplified by Mandal and Adu-Gyamfi [14], who used manually defined ROIs for vehicle tracking. Reference [15] generated a single ROI for each traffic direction on a highway, but their ROI is at road level and has no mechanism for finding the optimal vehicle detection area. Unique to our work, we automatically determine multiple ROIs (MROI) for lane detection using ML vehicle detection performance on surveillance cameras. Our MRLL system, independent of camera specifics, boosts accuracy by leveraging ML-detected vehicle motion and automatically finds the MROI. We also employ a Continual Learning strategy to enhance lane finding and vehicle counting in diverse real-world conditions.

Using MROI for different lanes significantly enhances lane detection accuracy compared to a single ROI. The MROI is auto-determined via high-confidence ML vehicle detection zones. Our first major contribution is the automated, multiple ROI lane learning (MRLL) system, which eliminates the need for any lane position user input. MRLL achieves superior results on 45 real-world test videos of varied difficulty.

This paper is structured as follows: Section II reviews related works. Section III details our proposed method. Section IV outlines experimental settings, while Section V offers results and their analysis. Section VI discusses our method's merits, drawbacks, and comparisons. Section VII concludes and suggests future improvements.

... features in a data-driven way, achieving superior results. Based on the availability of lane marking annotations, these algorithms can be classified as supervised or unsupervised learning. The popular supervised ones are SCNN [23], RESA [24], LSTR [25], and BézierLaneNet [26]. Among unsupervised and semi-unsupervised methods, some attempts have gained good accuracy in lane detection [27], [28]. These methods achieved good performance on urban roads and vehicle-mounted cameras.

However, existing supervised deep learning algorithms depend on time-consuming pixel-level annotations for lane boundaries and types. Most of the existing lane line detection datasets are generated in urban or highway scenes and vehicle-mounted views [29], such as the Tusimple benchmark dataset [30] and the CULane dataset [23]. More diverse datasets are needed, capturing varied traffic, weather, lighting, camera zoom angles, and lane marking conditions from both urban and highway roads, especially from surveillance videos, not just vehicle-mounted cameras.

We compare our result with 3 other direct lane detection methods: anchor-based lane extraction (LaneATT) [31], lane instance and line shape prediction (CondLaneNet [32]), and a combined convolutional neural network (CNN) and recurrent neural network (RNN) that extracts both temporal and spatial lane features [33]. Our lane detection result in Section V shows clear subjective and objective superiority compared to the SOTA on real-world video data.
... knowledge of total lane count; (2) they use GMM models to extract features, which are affected by illumination changes or unstable camera motion; (3) they rely on vehicle tracking and trajectory clustering methods, which we found have high computational cost compared to MRLL, and which also fail to learn lane centers during the severe ID switches common in congested traffic, poor lighting, or heavy occlusions; (4) none of them use multiple ROIs to deal with the varying accuracy of vehicle detection across the whole frame; and (5) none of them can deal with extreme cases when too few vehicles are on the road. MRLL addresses all of these limitations.

C. Vehicle Detection

Traditional machine vision object detection has been replaced by DL-based object detection. YOLO (You Only Look Once) [38] was a game-changer, using a single network for object classification and localization, which makes it well suited to real-time video applications. Successive improvements resulted in YOLOv2 [39], YOLOv3 [40], YOLOv4 [41], YOLOv5 [42], and YOLOv7 [43], with YOLOv7 still the best SOTA choice for inference at high resolution. With our data we compared YOLOv3, YOLOv4, YOLOv5, and YOLOv7, and confirmed the best object detection performance using YOLOv7. MRLL is agnostic to the choice of object detector.

III. THE PROPOSED ADAPTIVE MULTIPLE ROIs LANE LEARNING SYSTEM

We aim to create an efficient lane learning system for numerous highway surveillance cameras, adaptable to operator-controlled PTZ changes, varying road surfaces, and lighting conditions. Existing lane marking-based detection methods fall short, so we developed a system using vehicle detection and motion tracking over consecutive video frames. To ensure robust and reliable results, we run the MRLL process until enough vehicles are detected in each lane across all ROIs. Our MRLL approach is based on the following requirements and observations in videos:

• Highway lane detection on surveillance videos based on lane marks is not robust over various surfaces, weather, and lighting conditions.
• Analysis of traffic information for a road lane can be obtained at one location on the road and does not require traffic to be tracked along the whole road in the image.
• The camera installation location is often constrained by environmental limitations and is not optimized for lane detection.
• Vehicles, in general, are much more likely to move within their current lanes than to change to adjacent lanes.
• The road width on US highways is standardized.
• The average widths of passenger cars and trucks are different.
• Lanes of highways with high lane counts, and lanes far apart on the image, often appear in different regions of the image.
• The traffic status of a closed lane (without moving vehicles) need not be detected.
• CNN-based ML object detectors have different detection performances in different parts of a video frame. For example, vehicle detection is most accurate within certain regions where the details of the whole vehicle and its adjacency to other vehicles are visible.
• Optimal ROIs for vehicle detection, adaptive to camera views, are essential for vehicle counting, flow rate estimation, and traffic incident reporting.

Our lane detection method, based on tracking numerous moving vehicles, has contingencies for situations like severe traffic jams with static vehicles and sparse night traffic. The lane learning time is extended until a minimum of λ moving vehicles per lane is identified. The framework of the proposed system is built on two significant modules, as shown in Fig. 2: Module 1) Continual Learning Process and Module 2) Lane Learning Processing. Module 1 generates optimal ROIs per road direction and locates the precise lane centers within each ROI using data from accumulated vehicles.
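To make the cycle-based control flow concrete, below is a minimal Python sketch of this stopping rule. The helper callables run_one_cycle and lane_counts are hypothetical stand-ins for the detection and accumulation work of Module 1, and the max_cycles safety cap is our assumption, not a value from the paper.

    def continual_lane_learning(run_one_cycle, lane_counts, lam=100, max_cycles=60):
        """run_one_cycle(): collect one minute of detections and update lanes.
        lane_counts(): per-lane accumulated moving-vehicle counts so far."""
        for cycle in range(1, max_cycles + 1):
            run_one_cycle()
            counts = lane_counts()
            # Stop once every learned lane has at least lam moving vehicles.
            if counts and min(counts) >= lam:
                return cycle
        return max_cycles

    # Toy usage: each simulated cycle adds about 20 vehicles to each of 3 lanes,
    # so learning stops after 5 cycles with the default lam=100.
    state = [0, 0, 0]
    def one_cycle():
        for i in range(len(state)):
            state[i] += 20
    cycles_needed = continual_lane_learning(one_cycle, lambda: list(state))

Under sparse night traffic the same loop simply runs more cycles, which matches the adaptive behavior described in Section III-A below.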
Fig. 3. (a) Detection confidence score at each y-pixel position. (b) Confidence score distribution in the whole frame. (c) Learned road segments. (d) The mixed distribution combining the average confidence score and the count at each y-pixel position of the bottom half of the rightmost road segment, with the estimated baseline and ROI. (e) The determined valid road segments, multiple ROIs, and their baselines. (f) Histogram of vehicle data in the rightmost ROI. (g) Lane centers in multiple ROIs. (h) Data clustering to lanes. (i) Lane learning in multiple ROIs (each ROI is generated and marked with a green rectangle in each road segment).
The steps include vehicle detection, road segment creation, ROI and baseline determination, and lane center detection. Any object detector providing positions and confidence scores can be used to identify ROIs in the video frame. Optimal rectangular areas are proposed as ROIs. The ROI-down line is placed where detection confidence is highest. The counting baseline and ROI-up line are decided based on detection confidence and median car height. The ROI's horizontal boundaries are determined by the road segment's horizontal range. Lastly, lane centers are found using our LCD method [12], leading to lane curve generation. Module 2 completes lane learning using lane centers and vehicle data from Module 1, including steps like data clustering, lane curve fitting, and determining lane direction and boundaries.

A. Module 1: Continual Lane Learning Process

We propose to use Continual Lane Learning, which accumulates vehicle information over time to determine the best lane boundaries. It is an iterative process, where the data is evaluated at one-minute intervals, accumulating from the start. Parameters and lanes learned are updated based on the current result and past learning. Specifically, in our work, the process ensures robust lane detection by continuing learning until enough vehicles per lane are identified. This is particularly beneficial in low-traffic scenarios, like at night. Each learning cycle lasts one minute, stopping once each lane's accumulated vehicle data reaches λ (experimental results show that λ=100 is the optimal accuracy/efficiency trade-off). Better lighting improves vehicle detection, reducing the number of needed learning cycles. However, in low-visibility or night conditions, we adaptively use more cycles when the ML object detection confidence is low.

1) Vehicle Detection and Data Collection: Vehicles (cars and trucks) are detected in the entire frame, each marked with a bounding box, class name, and detection confidence score. During a cycle time T (set at one minute), all detected vehicles at each y-pixel position i are accumulated in a list V, with each vehicle V_i associated with a vector [x, y, w, h, fid, c], where x and y are its center coordinates, w and h are the width and height of the bounding box in pixels, fid is the frame index, and c is the detection confidence score. The data is displayed in Fig. 3(a), showing average (blue) and distributed (red) detection confidence scores by vertical position. The scores are lower at the image's top (y=0) and bottom due to smaller or partially occluded vehicles.
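The sketch below shows one way such a cycle list V could be accumulated. The Vehicle record mirrors the [x, y, w, h, fid, c] vector above; the detector output format and class names are illustrative assumptions (the 0.213 confidence floor is the value reported in Section IV), not the authors' interface.

    from dataclasses import dataclass

    @dataclass
    class Vehicle:
        x: float    # bounding-box center x (pixels)
        y: float    # bounding-box center y (pixels)
        w: float    # bounding-box width (pixels)
        h: float    # bounding-box height (pixels)
        fid: int    # frame index
        c: float    # detection confidence score

    def collect_cycle(detections_per_frame, keep_classes=("car", "truck"),
                      min_conf=0.213):
        """Accumulate one cycle T of detections into the list V.

        detections_per_frame: iterable of (fid, [(cls, conf, x, y, w, h), ...])
        tuples, one entry per frame of the one-minute cycle."""
        V = []
        for fid, dets in detections_per_frame:
            for cls, conf, x, y, w, h in dets:
                if cls in keep_classes and conf >= min_conf:
                    V.append(Vehicle(x, y, w, h, fid, conf))
        return V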
2) Valid Road Segment Generation: Valid road segments come from vehicle detection, and at different camera angles and distances the vehicle shape and size vary. CNN-based vehicle detectors perform well on some parts of the video frame but not all. As shown in Fig. 3(a), detection confidence scores indicate which image segments yield reliable vehicle detection results. The subsequent paragraphs and Fig. 3 detail the ROI trapezoid generation process. In the first step, we create a binary image of the same size as the original input image. We initially set each pixel's value to zero and then accumulate the number of vehicles detected at each pixel over time. Using each vehicle data vector V_i in V, we create a binary image: pixels within a vehicle's bounding box are set to one. To prevent areas near road boundaries from connecting, we reduce each vehicle's bounding box width to 25% of its original size (based on experimental results). For this task, we set a high confidence threshold. For daytime videos, the threshold adapts and is set to the 25th percentile of the video's confidence scores, as depicted in Fig. 3(b). For nighttime cases, the threshold is set to the 30th percentile due to lower confidence in low light. Fig. 3(c) displays the binary image, where the black region shows untrustworthy vehicle detection areas and the white depicts reliable ones.

3) Multiple ROIs and Baselines Determination: For each non-overlapping road segment, we generate two distributions: Conf_avg(y), representing the relationship between y-pixel position and average confidence score, and Count(y), reflecting the vehicle count at each y-pixel position. Assuming these are independent, and that the reference line should be near the side of the road segment closer to the camera (bottom), we determine the reference line using Equation (1):

    baseline_t = argmax_{y > R_cen_y_t} Conf_avg(y) * Count(y)    (1)

We define the y-pixel position at which the mixed distribution Conf_avg(y) * Count(y) attains its maximum value, under the constraint y > R_cen_y_t, as the y-pixel position of the reference line, and we name it the baseline. Based on the baseline, we further determine the ROI range following Equations (2) and (3):

    ROI_up = baseline_t − RR, if baseline_t − RR > 0; 0 otherwise    (2)

    ROI_down = baseline_t + RR, if baseline_t + RR ≤ BM; bottommost_y otherwise    (3)

where BM = bottommost_y and RR, the abbreviation for ROI range, is defined as RR = 4.5 * median_height; median_height and bottommost_y are the median height of all detected vehicles in the t-th road segment and the bottom y-pixel position of the road segment, respectively. The full ROI therefore spans nine times the median car height detected in the road segment, enough to cover entire vehicles. If a road segment's y-range is less than nine times the median car height from the whole frame, it is deemed a false positive with no ROI. Fig. 3(d) shows the result of Equation (1) for the rightmost road segment in Fig. 3(e) at various y-pixel positions of the image frames. The green vertical line marks the y-pixel position of the ROI up, the pink vertical line marks the y-pixel position of the ROI down, and the red vertical line shows the y-pixel position of the baseline. The black vertical line marks the y-pixel position of the center of the corresponding road segment, shown as the rightmost blue contour in Fig. 3(e). The details of this process can be found in Algorithm 1.

Algorithm 1: Automatically Finding Multiple ROIs
Require: Road_segments: valid road contour sets; n: number of Road_segments; V: accumulated vehicle data up to the current cycle
    index ← 0
    Baselines ← an empty set
    ROIs ← an empty set
    while index < n do
        Veh_heights ← an empty set    ▷ vehicle heights inside Road_segments[index]
        (leftmost_x, leftmost_y), (rightmost_x, rightmost_y), (topmost_x, topmost_y), (bottommost_x, bottommost_y), (c_x, c_y) ← leftmost, rightmost, topmost, bottommost, and center coordinates of Road_segments[index]
        conf ← an empty set
        count ← an empty set
        for y ← topmost_y to bottommost_y do
            conf_y ← an empty set    ▷ create a sub-set in conf
            count_y ← 0    ▷ create a sub-item in count, initialized to 0
        end for
        for v ∈ V do
            if leftmost_x < v_x < rightmost_x and topmost_y < v_y < bottommost_y then
                append v_h to Veh_heights; append v_c to conf_y
                count_y ← count_y + 1
                delete v from V
            end if
        end for
        conf_avg ← an empty set
        count_total ← an empty set
        for y ← topmost_y to bottommost_y do
            append avg(conf_y) to conf_avg    ▷ average confidence score of conf_y
            append count_y to count_total
        end for
        median_height ← median(Veh_heights)
        mix_distribution ← conf_avg × count_total
        obtain baseline, ROI_up, ROI_down using Equations (1), (2), (3), respectively
        append baseline to Baselines
        append ROI_up, ROI_down to ROIs
        index ← index + 1
    end while
    return Baselines, ROIs
4) Lane Center Detection at All Baselines: Vehicle center positions in each ROI are analyzed to create lane histograms, using areas tuned to actual road boundaries. Each vehicle within the ROI contributes to the histogram, and the peak of each histogram indicates a lane center. The first row of Fig. 3(f) shows the histogram of the rightmost ROI in Fig. 3(e); the peaks of these histograms are lane centers. Fig. 3(g) shows the lane centers on the baseline as red dots on the yellow lines. During each learning cycle, after generating lane centers, we check the vehicle count S_cl in each lane c_l. Vehicles are selected from a range near the baseline, [baseline − median_height/2, baseline + median_height/2], where median_height is the median height of the vehicles inside the corresponding ROI. If the count surpasses the threshold, S_cl ≥ λ, lane learning stops; otherwise, it continues with further detection and data collection.
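A minimal sketch of this histogram step is shown below, assuming vehicles_x holds the center x positions of vehicles selected near the baseline as described above. The bin width, minimum peak separation, and the use of SciPy's find_peaks are illustrative choices, not parameters from the paper.

    import numpy as np
    from scipy.signal import find_peaks

    def find_lane_centers(vehicles_x, roi_left, roi_right, bin_px=8, min_sep_px=40):
        bins = np.arange(roi_left, roi_right + bin_px, bin_px)
        hist, edges = np.histogram(vehicles_x, bins=bins)
        # Peaks must be separated by at least a rough minimum lane width.
        peaks, _ = find_peaks(hist, distance=max(1, min_sep_px // bin_px))
        return [(edges[p] + edges[p + 1]) / 2 for p in peaks]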
B. Module 2: Lane Learning Processing

1) Vehicle Data Clustering to Each Lane: Using the baseline and the detected lane centers, we assign each vehicle to a driving lane using a novel density-based clustering algorithm: clustering data along a curve. The clusters are initialized at the lane centers, and the method is highly efficient, clustering large datasets quickly. It is applied per lane in each ROI, beginning at the baseline with the initial lane centers c_1, c_2, ..., c_l, ..., c_k, sorting out the vehicles whose vertical position equals y and storing them in the list V_y. For each V_y, we calculate the absolute distance between the x-pixel position of each vehicle V_j in V_y and every center c_l; the center c_m with the shortest distance is the center to which V_j should be grouped:

    |V_j(x) − c_m| = min_{l=1,2,...,k} |V_j(x) − c_l|    (4)

Then

    V_j ∈ C_m    (5)

After that, the x-pixel position of every cluster center and its corresponding data container are updated for each vertical pixel in the region (except at the baseline) as follows:

    c_l = ( Σ_{q=1}^{n} V_lq(x) ) / n, if C_l is not empty; c_l otherwise    (6)

    C_l = ∅    (7)

where V_lq is a vehicle grouped into center c_l. The updated cluster centers are then used for vehicle clustering at the next y-pixel position. The clustering is an iterative process over Equations (4) to (7) which does not stop until all the y-pixel positions in the range [baseline − ROI_up, baseline + ROI_down] are checked. Fig. 3(h) displays an example of data clustering in three ROIs. The original detected vehicle data (drawn in pink) is marked with a different color when it is grouped to a lane.

2) Lane Curve Fitting: Instead of directly linking clustering centers at each y-pixel position, we use polynomial curve fitting to derive lane lines, fitting a second-degree polynomial of the form x = a*y^2 + b*y + c to the clustered data of each lane, where x and y are the centroids of detected vehicles.

3) Lane Direction and Indexing Generation: For each lane, we determine the direction by analyzing vehicle data. Assume V_i(x, y, w, h, fid, c) and V_j(x, y, w, h, fid+1, c) represent the same vehicle detected in two consecutive frames; the moving distance along the vertical axis is then the difference between V_i and V_j. If the difference between V_i and V_j in vertical coordinates across two consecutive frames is less than 25% of the vehicle's size, we consider them the same vehicle. Movement direction is based on the change in vertical coordinates, regardless of slight variations in bounding box height. Thus, we obtain a binary up (+1) or down (−1) direction. Assuming a total of K pairs of vehicles are used for the lane direction calculation, the direction D_l is:

    D_l = −1, if Σ_{i=1}^{K} (V_j(y) − V_i(y)) < 0; 1 otherwise    (8)

The learned lane directions are drawn as blue arrows with different orientations at each lane center position, as shown in Fig. 3(i). Lane indexing varies with direction. To the right, indexing starts at 1 and increases towards the road's right. Conversely, to the left, it begins at -1 and decreases towards the road's left. For instance, as in Fig. 3(i), with three right lanes the indexes are 1, 2, 3, and for four left lanes they are -1, -2, -3, -4.
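The sketch below illustrates this pipeline: the row-by-row clustering sweep of Equations (4) to (7), the second-degree fit, and the direction sign of Equation (8). The array layout and the single bottom-to-top sweep are simplifying assumptions; the paper's sweep covers the full ROI range around the baseline.

    import numpy as np

    def cluster_along_curve(vehicles, init_centers, y_top, y_bottom):
        """vehicles: (N, 2) array of [x, y]; init_centers: lane centers at the
        baseline. Re-centers each cluster row by row (Eqs. (4)-(7))."""
        centers = [float(c) for c in init_centers]
        lanes = {l: [] for l in range(len(centers))}
        for y in range(y_bottom, y_top - 1, -1):              # sweep ROI rows
            Vy = vehicles[vehicles[:, 1].astype(int) == y]
            buckets = {l: [] for l in range(len(centers))}
            for x, _ in Vy:
                m = int(np.argmin([abs(x - c) for c in centers]))   # Eq. (4)
                buckets[m].append(x)                                # Eq. (5)
                lanes[m].append((x, y))
            for l, xs in buckets.items():   # Eq. (6): update non-empty centers;
                if xs:                      # Eq. (7): buckets reset every row
                    centers[l] = float(np.mean(xs))
        return lanes

    def fit_lane_curve(points):
        """Second-degree fit x = a*y^2 + b*y + c to one lane's clustered data."""
        ys = [p[1] for p in points]
        xs = [p[0] for p in points]
        return np.polyfit(ys, xs, 2)        # coefficients (a, b, c)

    def lane_direction(dy_pairs):
        """Eq. (8): sign of the summed frame-to-frame vertical displacements."""
        return -1 if sum(dy_pairs) < 0 else 1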
4) Lane Boundary Generation: As we are interested in counting vehicles in each lane, lane boundary detection is necessary for correct vehicle-lane assignment. For each lane c_l, we find a left boundary and a right boundary at each position y in the ROI range [baseline − ROI_up, baseline + ROI_down]. The specific calculation is:

    c_l[y][left_bound] = c_l − median_width_y, if l = 0; (c_l + c_{l−1}) / 2 otherwise    (9)

    c_l[y][right_bound] = c_l + median_width_y, if l = k−1; (c_l + c_{l+1}) / 2 otherwise    (10)

where median_width_y is the vehicles' median width at the y-pixel position, generated from the data in V_y of each ROI. For each lane, a pair of lane boundaries is generated and drawn as pink curves, as shown in Fig. 3(i).
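A small sketch of Equations (9) and (10) follows, assuming the lane centers at a given y are ordered from left to right. As in the text, interior boundaries are midpoints of adjacent centers and the two outermost boundaries are offset by the median vehicle width at that y.

    def lane_bounds(centers, median_width_y):
        """centers: ordered lane-center x positions at one y; returns one
        (left_bound, right_bound) pair per lane."""
        k = len(centers)
        bounds = []
        for l, c in enumerate(centers):
            left = c - median_width_y if l == 0 else (c + centers[l - 1]) / 2       # Eq. (9)
            right = c + median_width_y if l == k - 1 else (c + centers[l + 1]) / 2  # Eq. (10)
            bounds.append((left, right))
        return bounds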
Fig. 4. Correct Lane Learning Results in Various Situations. Column 1: Sunny and easy cases; Column 2: Congested traffic cases; Column 3: Complicated
road structure cases; Column 4: Harsh weather cases; Column 5: Night time cases.
Fig. 5. (a) Perfect Lane Center and Direction Detection. (b) Missing Lane Center Detection and Extra Lane Center Detection. (c) Wrong Lane Direction Detection. (d) Correct Lane Boundary Pairs and Wrong Lane Boundary Pairs.

Fig. 6. Unsatisfactory Lane Learning Cases.
IV. EXPERIMENTAL SETTING AND TEST DATA

The Indiana Department of Transportation's 600 cameras monitor traffic across the state. These cameras offer nine presets (northbound near and far, southbound near and far, eastbound near and far, westbound near and far, plus a home direction) and adjustable viewing angles and zoom levels. We tested our algorithm on 45 video samples from these cameras, grouped into five categories: Sunny, Rainy, Snowy, Congestion, and Night. The algorithm uses the YOLOv7 object detector pre-trained on MS COCO data. The model only retains detected objects of type car or truck with a confidence of 0.213 or higher. Performance varies due to factors like lighting, weather, and traffic congestion. After vehicle detection, the bounding box locations are used to generate the final lane center learning. We compared our system's lane detection with single ROI detection and human-labeled ground truth, and ran it on a computer with a 3.80 GHz 16-core CPU, 32 GB RAM, and a Quadro RTX 5000 GPU.

V. RESULTS

A. Qualitative Analysis

Our MRLL system, compatible with all YOLO versions, performed best with YOLOv7. We tested various road conditions and traffic scenarios, generating road segments, ROIs, baselines, and lane centers. For each category, examples of successful testing results with YOLOv7 are shown in Fig. 4, with similar road conditions grouped in the same column. The algorithm performs well in ideal lighting, weather, and camera angles, resulting in accurate vehicle detection. Even in harsh conditions like night or fog, or when roads are far from the camera, our system proves effective due to multiple ROI lane learning and continuous learning. It even performs well in heavy traffic or on dark nights with fewer vehicles. Our system demonstrates robust lane detection performance in diverse scenarios.

B. Quantitative Analysis

1) Evaluation Criteria and Metrics: To evaluate our lane detection, we compare the detected lane centers to the ground truth. If a lane center is detected, it is a true positive (TP), as shown in Fig. 5(a); if missing, a false negative (FN), as in Fig. 5(b), where a red circle on the middle road marks one missing lane center detection; and if an extra lane is detected, a false positive (FP), as in Fig. 5(b), where a red circle on the left road marks one extra detected lane center. If either half of a pair of lane boundaries is incorrect, it is considered an FN or FP detection. We measure performance with an F1_score calculated from precision and recall, defined in Equations (11) to (13). The F1_score evaluation covers the lane center detection ratio and lane direction accuracy (LDA). The LDA is the ratio of lanes with accurate lane direction detection, Total_TP_Lane_direction_Number, to TP lane center detections, Total_TP_Lane_Number, as shown in Equation (14). To compare our method to the previous single ROI detection, we use the same F1_score and calculate lane center accuracy across the whole frame.
TABLE I
OUR MRLL LANE LEARNING SYSTEM PERFORMANCE WITH YOLOv7 COMPARED WITH GROUND TRUTH, SINGLE ROI METHOD (LCD), AND MRLL WITHOUT CONTINUOUS LEARNING (S-MRLL: SINGLE-CYCLE MRLL)

TABLE II
AVERAGE EXECUTION TIME OF EACH STEP WITH YOLOv7, IN SECONDS
    Precision = TP / (TP + FP)    (11)

    Recall = TP / (TP + FN)    (12)

    F1_score = (Precision * Recall) / ((Precision + Recall) * 0.5)    (13)

    LDA = Total_TP_Lane_direction_Number / Total_TP_Lane_Number    (14)
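Written out directly, the four metrics reduce to a few lines. The counts below are illustrative, and the placement of the 0.5 factor follows our reconstruction of Equation (13), which makes F1_score equal to the conventional 2PR/(P+R).

    def lane_metrics(tp, fp, fn, tp_direction):
        """tp, fp, fn: lane-center counts; tp_direction: TP lanes whose
        direction was also detected correctly. Assumes nonzero denominators."""
        precision = tp / (tp + fp)                                # Eq. (11)
        recall = tp / (tp + fn)                                   # Eq. (12)
        f1 = (precision * recall) / ((precision + recall) * 0.5)  # Eq. (13)
        lda = tp_direction / tp                                   # Eq. (14)
        return precision, recall, f1, lda

    # Hypothetical example: 90 TP, 6 FP, 10 FN, 86 correct directions.
    print(lane_metrics(90, 6, 10, 86))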
For the time cost during cycle-learning data collection, we can use the cycle number as the measure: fewer cycles mean faster lane detection and correspond to easier traffic scenes with more vehicles.

Fig. 7. Lane Center Detection (LCD) accuracy, Lane Boundary Detection (LBD) accuracy, and average learning cycles (one minute per cycle) with different λ values on day and night videos.
2) Lane Detection Performance With Different YOLO Versions: Table I showcases the lane detection accuracy and time cost of YOLOv7 in various conditions. Our MRLL system outperformed the single ROI lane center detection (LCD) method in every condition, with even the lowest margin being 2.47% higher accuracy, in rainy situations. It achieved its highest F1_score of 0.963 in congestion cases. MRLL detected over 24% more lanes than LCD in sunny conditions, indicating its superior utility in traffic surveillance. Across all cases, lane direction accuracy exceeded 94% and the F1_score of lane boundaries was over 0.88. Night conditions required more cycles due to fewer vehicles and lower detection accuracy. Yet MRLL proved efficient, needing 1-2 cycles in congestion and about 7 at night, demonstrating its robustness even in challenging situations.

3) Execution Time in Each Step With YOLOv7: After data collection with YOLOv7, we analyzed the execution time of each step. Table II reveals that ROI and baseline determination, and data clustering to each lane, are the most time-consuming steps, taking over 1 second; other steps take less than 0.3 seconds. Data clustering time scales with the number of vehicles; hence in night cases the total time cost is the lowest, while in congestion it is the highest.

4) Setting λ Value: The learning duration in our model depends on traffic volume and the λ value: a larger λ or low traffic volume means a longer learning time. To find the optimal λ, we tested 5 values (50, 100, 150, 200, 250) on 10 day and 10 night cases. Results are shown in Fig. 7. We see that learning time increases linearly at night with rising λ, but stays nearly constant during the day due to ongoing traffic. Both day and night lane accuracy increase with λ until they plateau. These experiments show that the system is most efficient when λ equals 100.
Fig. 8. Comparison of our lane boundary detection performance with other supervised lane marking detection methods. Column 1: Original images; Column 2: SCNN results [23]; Column 3: RESA results [24]; Column 4: LaneATT results [31]; Column 5: Our results.
To prove the superiority of the continual strategy, we implemented control experiments of lane learning in multiple ROIs on the same 45 videos, allotting a total of 7 minutes (based on the results of Table II) for each video. This method, S-MRLL, showed the performance reported in Table I, with its average lane center detection accuracy 19.6% lower than MRLL's. Since both lane direction and boundary detection depend on lane center detection, continual learning is essential.

5) Unsatisfactory Lane Detection Performance Analysis: We reviewed cases with subpar performance, as displayed in Fig. 6. Such incorrect lane detections are tied to complex scenarios: missed detections due to cameras blurred by rain (Fig. 6(a)), missed lane center and ROI due to unsatisfactory camera angles (Fig. 6(b)), incorrect ROI detection due to traffic sign occlusion (Fig. 6(c)), and missed vehicle detection due to low light (Fig. 6(d)). Future enhancements can come from using transfer learning and updating YOLO-series object detection training with these challenging images.

VI. DISCUSSION

Our MRLL method works well in numerous ITS environments, as shown in Fig. 4. It uniquely uses vehicle object detection and limited motion information, negating the need for time-consuming data labeling or road geometry knowledge. It does not require camera calibration or rely on static cameras, and it is robust to light variations and sensor noise. The system detects lane boundaries separately and adapts to varying vehicle sizes. MRLL is computationally efficient and implements a cycle learning strategy for low vehicle density situations. We have compared our MRLL lane detection results with human-labeled ground truth on our data and proved its effectiveness.

1) Method's Limitations: However, MRLL does have limitations. It depends on vehicle volume and ML object detection accuracy. It can be misled by predominantly truck traffic due to truck size and oblique camera angles. The system struggles with stationary vehicles and uses some heuristic-based thresholds, which determine the balance between detection accuracy and speed. Even though we addressed errors in each MRLL sub-module we designed, the whole system's accuracy is measurable against SOTA lane learning systems, as seen in Fig. 8.

2) Comparison With Other Lane Detection Methods on Our Data: Without any lane marking labels, our lane detection problem is formulated as a motion trajectory clustering and fitting problem. The most similar works to ours are [34], [35], and [36]. While it is challenging to directly compare our work with those methods due to a lack of public code, it is still possible to provide an insightful discussion. Unlike these methods, which build motion trajectories with both detection and tracking algorithms, we only use vehicle detection information within optimal regions (multiple ROIs), which is more computationally efficient. Furthermore, MRLL automatically chooses the highest-accuracy regions. We compared alternative clustering methods such as K-means and RANSAC; our grouping method is preferred because each cycle does not need a random initial lane number setting, and we found MRLL's clustering more efficient and robust in various ITS environments.

Even though our method has data requirements different from current popular supervised lane detection methods, which require lane markings to be annotated, we have successfully compared MRLL to three SOTA methods on lane boundary (marking) detection performance using the same video input, ground truth, and estimation criteria. We assess lane boundary detection on our 335 lanes from 45 traffic videos with three SOTA models: SCNN [23], RESA [24], and LaneATT [31]. We show the comparison results on two images in Fig. 8. Compared with these SOTA lane detection algorithms (trained on the Tusimple dataset), our method shows superiority and robustness in lane boundary detection on our real-world data. Specifically, in terms of lane boundary detection accuracy measured as correct detections out of 335 total lanes, SCNN achieved 5.1%, RESA 4.8%, and LaneATT 7.8%, while our method reached 72%. We have shown that MRLL, using our real-world data and without any lane annotations, is quantitatively superior to three publicly available SOTA lane detectors. Methods based on motion trajectories purely extracted with vehicle detection can be more suitable for scenarios with poor lane markings, occlusions, or situations where traditional lane detection is challenging.

3) Automatically Finding the Best Regions of Interest (ROIs) in Highway Videos Can Offer Several Potential Benefits: By identifying and focusing on specific ROIs within the video frames, deep learning based vehicle detection and tracking models can reduce the computational load. This optimization is especially valuable for real-time or resource-constrained applications, as it can lead to faster processing and reduced hardware requirements. Targeting specific ROIs allows the model to concentrate its attention on areas where vehicles are more likely to appear. This can help reduce false positives in detection, resulting in a more accurate and reliable system even in challenging scenarios such as occlusions or
adverse weather conditions. Accurate and real-time or near-real-time vehicle detection and tracking in highway videos, enhanced by optimal ROIs, can contribute to improved traffic management such as traffic flow analysis, congestion forecasting, automatic incident detection, and other higher-level requirements of traffic surveillance.

VII. CONCLUSION

This paper expands our prior LCD method into MRLL, a multiple-ROI lane learning system that creates lane centers, curves, and boundaries using YOLO object detection. Tested on 45 diverse videos, it achieves an F1_score above 0.79 for lane center detection, above 0.88 for lane boundaries, and 94% accuracy in traffic direction. MRLL, compatible with any vehicle detector that provides a confidence score, is unique in that it does not rely on lane markings, adapts to camera views, and offers continual learning for reliable results. It is being used by the Indiana Department of Transportation in challenging real-world ITS scenarios. Future extensions include improving vehicle detection with transfer learning algorithms and implementing a flexible camera angle-checking strategy. Overall, the adaptive MRLL framework demonstrates significant potential in traffic surveillance applications.

REFERENCES

[1] D. Liang, Y.-C. Guo, S.-K. Zhang, T.-J. Mu, and X. Huang, "Lane detection: A survey with new results," J. Comput. Sci. Technol., vol. 35, no. 3, pp. 493–505, May 2020.
[2] J. Tang, S. Li, and P. Liu, "A review of lane detection methods based on deep learning," Pattern Recognit., vol. 111, Mar. 2021, Art. no. 107623.
[3] H. Ghahremannezhad, H. Shi, and C. Liu, "Robust road region extraction in video under various illumination and weather conditions," in Proc. IEEE 4th Int. Conf. Image Process., Appl. Syst. (IPAS), Dec. 2020, pp. 186–191.
[4] H. Li, Y. Chen, Q. Zhang, and D. Zhao, "BiFNet: Bidirectional fusion network for road segmentation," IEEE Trans. Cybern., vol. 52, no. 9, pp. 8617–8628, Sep. 2022.
[5] S. Luo, X. Zhang, J. Hu, and J. Xu, "Multiple lane detection via combining complementary structural constraints," IEEE Trans. Intell. Transp. Syst., vol. 22, no. 12, pp. 7597–7606, Dec. 2021.
[6] S.-N. Kang, S. Lee, J. Hur, and S.-W. Seo, "Multi-lane detection based on accurate geometric lane estimation in highway scenarios," in Proc. IEEE Intell. Vehicles Symp., Jun. 2014, pp. 221–226.
[7] M. A. Helala, K. Q. Pu, and F. Z. Qureshi, "Road boundary detection in challenging scenarios," in Proc. IEEE 9th Int. Conf. Adv. Video Signal-Based Surveill., Sep. 2012, pp. 428–433.
[8] H. Kong, J.-Y. Audibert, and J. Ponce, "Vanishing point detection for road detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2009, pp. 96–103.
[9] H. Ghahremannezhad, C. Liu, and H. Shi, "Traffic surveillance video analytics: A concise survey," in Proc. 18th Int. Conf. Mach. Learn. Data Mining, New York, NY, USA, 2022, pp. 263–291.
[10] J. Tian, Z. Wang, and Q. Zhu, "An improved lane boundaries detection based on dynamic ROI," in Proc. IEEE 9th Int. Conf. Commun. Softw. Netw. (ICCSN), May 2017, pp. 1212–1217.
[11] Y. Shen, Y. Bi, Z. Yang, D. Liu, K. Liu, and Y. Du, "Lane line detection and recognition based on dynamic ROI and modified firefly algorithm," Int. J. Intell. Robot. Appl., vol. 5, no. 2, pp. 143–155, Jun. 2021.
[12] M. Qiu et al., "Intelligent highway lane center identification from surveillance camera video," in Proc. IEEE Int. Intell. Transp. Syst. Conf. (ITSC), Sep. 2021, pp. 2506–2511.
[13] J. Lee, B. Hong, S. Jung, and V. Chang, "Clustering learning model of CCTV image pattern for producing road hazard meteorological information," Future Gener. Comput. Syst., vol. 86, pp. 1338–1350, Sep. 2018.
[14] V. Mandal and Y. Adu-Gyamfi, "Object detection and tracking algorithms for vehicle counting: A comparative analysis," J. Big Data Anal. Transp., vol. 2, no. 3, pp. 251–261, Dec. 2020.
[15] H. Ghahremannezhad, H. Shi, and C. Liu, "A new adaptive bidirectional region-of-interest detection method for intelligent traffic video analysis," in Proc. IEEE 3rd Int. Conf. Artif. Intell. Knowl. Eng. (AIKE), Dec. 2020, pp. 17–24.
[16] N. Arshad, K.-S. Moon, S.-S. Park, and J.-N. Kim, "Lane detection with moving vehicles using color information," in Proc. World Congr. Eng. Comput. Sci. (WCECS), vol. 1, San Francisco, CA, USA, Oct. 2011.
[17] Q. Lin, Y. Han, and H. Hahn, "Real-time lane departure detection based on extended edge-linking algorithm," in Proc. 2nd Int. Conf. Comput. Res. Develop., May 2010, pp. 725–730.
[18] A. Borkar, M. Hayes, and M. T. Smith, "A novel lane detection system with efficient ground truth generation," IEEE Trans. Intell. Transp. Syst., vol. 13, no. 1, pp. 365–374, Mar. 2012.
[19] C. R. Jung and C. R. Kelber, "Lane following and lane departure using a linear-parabolic model," Image Vis. Comput., vol. 23, no. 13, pp. 1192–1202, Nov. 2005.
[20] Y. Wang, E. K. Teoh, and D. Shen, "Lane detection and tracking using B-Snake," Image Vis. Comput., vol. 22, pp. 269–280, Jan. 2004.
[21] K. Zhao, M. Meuter, C. Nunn, D. Müller, S. Müller-Schneiders, and J. Pauli, "A novel multi-lane detection and tracking system," in Proc. IEEE Intell. Vehicles Symp., Jun. 2012, pp. 1084–1089.
[22] J. Xiao, W. Xiong, Y. Yao, L. Li, and R. Klette, "Lane detection algorithm based on road structure and extended Kalman filter," Int. J. Digit. Crime Forensics, vol. 12, no. 2, pp. 1–20, Apr. 2020.
[23] X. Pan, J. Shi, P. Luo, X. Wang, and X. Tang, "Spatial as deep: Spatial CNN for traffic scene understanding," in Proc. AAAI Conf. Artif. Intell., vol. 32, no. 1, Apr. 2018, pp. 1–8.
[24] T. Zheng et al., "RESA: Recurrent feature-shift aggregator for lane detection," in Proc. AAAI Conf. Artif. Intell., vol. 35, no. 4, 2021, pp. 3547–3554.
[25] R. Liu, Z. Yuan, T. Liu, and Z. Xiong, "End-to-end lane shape prediction with transformers," in Proc. IEEE Winter Conf. Appl. Comput. Vis. (WACV), Jan. 2021, pp. 3694–3702.
[26] Z. Feng, S. Guo, X. Tan, K. Xu, M. Wang, and L. Ma, "Rethinking efficient lane detection via curve modeling," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2022, pp. 17062–17070.
[27] C. Li, B. Zhang, J. Shi, and G. Cheng, "Multi-level domain adaptation for lane detection," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2022, pp. 4379–4388.
[28] M. Ghafoorian, C. Nugteren, N. Baka, O. Booij, and M. Hofmann, "EL-GAN: Embedding loss driven generative adversarial networks for lane detection," in Proc. Eur. Conf. Comput. Vis. (ECCV) Workshops, Sep. 2018.
[29] S. Shirke and R. Udayakumar, "Lane datasets for lane detection," in Proc. Int. Conf. Commun. Signal Process. (ICCSP), Apr. 2019, pp. 792–796.
[30] S. Yoo et al., "End-to-end lane marker detection via row-wise classification," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2020, pp. 1006–1007.
[31] L. Tabelini, R. Berriel, T. M. Paixão, C. Badue, A. F. De Souza, and T. Oliveira-Santos, "Keep your eyes on the lane: Real-time attention-guided lane detection," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2021, pp. 294–302.
[32] L. Liu, X. Chen, S. Zhu, and P. Tan, "CondLaneNet: A top-to-down lane detection framework based on conditional convolution," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2021, pp. 3773–3782.
[33] Q. Zou, H. Jiang, Q. Dai, Y. Yue, L. Chen, and Q. Wang, "Robust lane detection from continuous driving scenes using deep neural networks," IEEE Trans. Veh. Technol., vol. 69, no. 1, pp. 41–54, Jan. 2020.
[34] J. Ren, Y. Chen, L. Xin, and J. Shi, "Lane detection in video-based intelligent transportation monitoring via fast extracting and clustering of vehicle motion trajectories," Math. Problems Eng., vol. 2014, pp. 1–12, Jan. 2014.
[35] Z. Chen, Y. Yan, and T. Ellis, "Lane detection by trajectory clustering in urban environments," in Proc. 17th Int. IEEE Conf. Intell. Transp. Syst. (ITSC), Oct. 2014, pp. 3076–3081.
[36] J. Melo, A. Naftel, A. Bernardino, and J. Santos-Victor, "Detection and classification of highway lanes using vehicle motion trajectories," IEEE Trans. Intell. Transp. Syst., vol. 7, no. 2, pp. 188–200, Jun. 2006.
[37] T. Zhang, "Longitudinal scanline based spatial-temporal modeling and processing for vehicle trajectory reconstruction," Ph.D. dissertation, School of Graduate Studies, Rutgers State Univ. of New Jersey, Piscataway, NJ, USA, 2022.
[38] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 779–788.
[39] J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 7263–7271.
[40] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," 2018, arXiv:1804.02767.
[41] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal speed and accuracy of object detection," 2020, arXiv:2004.10934.
[42] G. Jocher, "Ultralytics/YOLOv5: v6.1 - TensorRT, TensorFlow Edge TPU and OpenVINO export and inference," Zenodo, Feb. 2022, doi: 10.5281/zenodo.6222936.
[43] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2023, pp. 7464–7475.

Lauren Christopher (Senior Member, IEEE) received the B.S. and M.S. degrees in EECS from MIT in 1982 and the Ph.D. degree from Purdue University in 2003. She joined Purdue University, Indianapolis, IN, USA, in 2008. She was with RCA's David Sarnoff Research Laboratories and Thomson Consumer Electronics, where she led the first DirecTV receiver design. In 2010, she was inducted into the Consumer Electronics Hall of Fame for leading the development of the DirecTV set-top box. She is currently teaching ECE courses and heading the Machine Intelligence and Computer Vision in 3D Laboratory in the Department. Her research interests include machine learning, 3D image sensors, and computer vision algorithms.

Stanley Yung-Ping Chien (Life Senior Member, IEEE) received the Ph.D. degree in electrical and computer engineering from Purdue University, West Lafayette, IN, USA, in 1989. He is currently a Professor with the Department of Electrical and Computer Engineering, Purdue University, Indianapolis, IN, USA. His research interests include vehicle active safety, dynamic load balancing of parallel computing, software engineering, and robotics.