3. The list of articles related to our topic
• Car Price Prediction Using Machine Learning
https://drive.google.com/file/d/1ePbZeFKORcUpLpJbelFivG-tYquqGLZl/view?usp=share_link
• Used Cars Price Prediction using Machine Learning with Optimal
Features
https://drive.google.com/file/d/1k1S0SNS8RPhYNs-dDzQ4qyxZ6GjBK5fq/view?usp=share_link
• OLD CAR PRICE PREDICTION WITH MACHINE LEARNING
https://drive.google.com/file/d/1h1fx1cpvgvODYiGmYBw7rfUwtJ_aSc4g/view?usp=share_link
4. Car Price Prediction Using Machine Learning
• Importance:
The article gives an overview of machine learning, its evolution, and
its potential applications in real-life scenarios. Its abstract notes
that machine learning is not a new science, but that advances in
computing technology have given it fresh momentum. Because machine
learning models can adapt independently and learn from previous
computations, they are an important tool for solving prediction
problems and analyzing data.
5. • Aim of it:
The aim is to build a supervised machine learning model that can
forecast the value of a vehicle from multiple attributes. The system
must also be feature-based, meaning that feature-wise prediction must
be possible. The article also presents graphical comparisons to give a
clearer view of the results.
• Materials:
NumPy, SciPy, scikit-learn, Jupyter Notebook, and Enthought Canopy;
K-Nearest Neighbours (KNN) and Decision Tree Regression (CART)
• Results:
“We found that the root means square error for KNN with k = 7 is
5581.96 and for CART is 4961.64 and actual price was 4999.”
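The comparison above rests on the root mean squared error (RMSE) of each model's predictions. A minimal sketch of how an RMSE for a KNN regressor with k = 7 could be computed is shown below. It uses only the Python standard library rather than the libraries listed under Materials, and the toy data, feature choice, and function names are hypothetical illustrations, not the article's dataset or code.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def knn_predict(train_x, train_y, query, k=7):
    """Predict a price as the mean price of the k nearest training rows."""
    # Euclidean distance from the query to every training row, nearest first
    by_distance = sorted(
        (math.dist(row, query), price) for row, price in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(price for _, price in nearest) / len(nearest)

# Hypothetical toy data: (kilometres in thousands, vehicle age in years) -> price
train_x = [(10, 1), (20, 2), (35, 3), (50, 4), (65, 5),
           (80, 6), (95, 7), (110, 8), (125, 9), (140, 10)]
train_y = [9500, 9000, 8200, 7400, 6600, 5900, 5100, 4400, 3800, 3000]
test_x = [(30, 2), (100, 8)]
test_y = [8500, 4800]

predictions = [knn_predict(train_x, train_y, q, k=7) for q in test_x]
print("KNN (k=7) RMSE:", round(rmse(test_y, predictions), 2))
```

A lower RMSE means the predicted prices sit closer to the actual prices on average, which is the sense in which CART is reported as the better model.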
6. • Conclusion
In conclusion, the authors implemented a machine learning model, using
prominent algorithms from Python libraries, that predicts the value of
a vehicle from multiple attributes. After pre-processing and cleaning
the dataset, they found that price is correlated with both kilometers
traveled and year of registration. The model was trained on three lakh
(300,000) tuples, and the K Nearest Neighbour (KNN) and Classification
and Regression Trees (CART) algorithms were compared, with CART
proving to be more accurate.
7. • References:
[1] M. Antonakakis, T. April, M. Bailey, M. Bernhard, E. Bursztein, J.
Cochran, Z. Durumeric, J. A. Halderman, L. Invernizzi, M. Kallitsis, D.
Kumar, C. Lever, Z. Ma, J. Mason, D. Menscher, C. Seaman, N. Sullivan,
K. Thomas, and Y. Zhou, "Understanding the Mirai botnet," in Proc. of
USENIX Security Symposium, 2017.
[2] Iman Sharafaldin, Arash Habibi Lashkari, and Ali A. Ghorbani,
"Toward Generating a New Intrusion Detection Dataset and Intrusion
Traffic Characterization," 4th International Conference on Information
Systems Security and Privacy (ICISSP), Portugal, January 2018.
[3] Hossein Hadian Jazi, Hugo Gonzalez, Natalia Stakhanova, and Ali
A. Ghorbani, "Detecting HTTP-based Application Layer DoS attacks on
Web Servers in the presence of sampling," Computer Networks, 2017.
[4] A. Shiravi, H. Shiravi, M. Tavallaee, and A. A. Ghorbani, "Toward
developing a systematic approach to generate benchmark datasets for
intrusion detection," Comput. Security 31 (3) (2012) 357–374.
8. Used Cars Price Prediction using Machine Learning
with Optimal Features
• Importance:
The article addresses a common problem faced by individuals in the
market for personal vehicles: knowing what a fair price is. By
combining machine learning algorithms with statistical tests, the
proposed method can help both buyers and sellers make informed
decisions about the best price for a vehicle, leading to more
productive and efficient transactions. The prediction results reported
in the study suggest that the method is a promising solution to this
problem.
9. • Aim of it:
To propose a method that predicts the optimal price for buying and
selling personal vehicles based on market value. The study focuses on
three target groups: used car dealers, individuals interested in
buying or selling a used car, and online web services that determine
the market value of a used car. The proposed method combines
regression analysis, recursive feature elimination (RFE), and
statistical tests to predict the optimal price for a car, so that
buyers and sellers can make informed decisions about the best price
for their vehicle.
• Materials:
Random Forest, deep learning model, long short-term memory
(LSTM), convolutional neural network architectures, recursive feature
elimination for linear regression, Kaggle for data collection.
10. • Results
The study found the model to be productive and effective in predicting
car prices. The R² score, which measures how well the linear
regression model fits the data, was 90%, indicating a good fit. The
optimal features for the model were identified as FUELTYPE,
ASPIRATION, CARBODY, DRIVEWHEEL, WHEELBASE, CARLENGTH, CARWIDTH,
CURBWEIGHT, ENGINETYPE, CYLINDERNUMBER, ENGINESIZE, BORERATIO,
HORSEPOWER, PRICE, BRAND_CATEGORY, and MILEAGE. The study used label
encoding for categorical attributes and rescaled the other features.
The model was refined by using recursive feature elimination (RFE) to
find the weights of the features, dropping certain features based on
their p-value and variance inflation factor (VIF), and monitoring the
error rate.
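The pre-processing step described above (label encoding for categorical attributes, rescaling for the others) can be sketched as follows. This is a minimal illustration in plain Python: only the column names FUELTYPE and HORSEPOWER come from the study, while the sample values, the helper names, and the choice of min-max rescaling are assumptions, since the summary does not say which rescaling the authors used. The RFE, p-value, and VIF steps are not covered here.

```python
def label_encode(values):
    """Map each distinct category to an integer code (codes follow sorted order)."""
    mapping = {category: code for code, category in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

def min_max_scale(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical sample rows; only the column names come from the study
fueltype = ["gas", "diesel", "gas", "gas"]
horsepower = [111, 154, 102, 115]

codes, mapping = label_encode(fueltype)
scaled = min_max_scale(horsepower)

print("FUELTYPE mapping:", mapping)
print("FUELTYPE codes:", codes)
print("HORSEPOWER rescaled:", [round(v, 3) for v in scaled])
```

Rescaling keeps features measured in large units, such as curb weight, from dominating features measured in small units when the regression weights are compared.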
11. • Conclusion
In conclusion, the authors developed a machine learning model based
on optimal features for predicting the prices of used vehicles. The
study employed recursive feature elimination (RFE), ordinary least
squares (OLS) regression, and the variance inflation factor (VIF) to
select the best features and improve the model's performance. The
model achieved an R² score of 90%, which the authors take as evidence
of its effectiveness for both customers and dealers in the used
vehicle market. The study suggests that the model could support
customers in selling and buying used vehicles and help dealers make
profitable deals.
12. • References:
[1] D. Van Thai, L. N. Son, P. V. Tien, N. N. Anh, and N. T. N. Anh,
“Prediction car prices using quantify qualitative data and knowledge-
based system,” in 2019 11th International Conference on Knowledge
and Systems Engineering (KSE), 2019, pp. 1–5.
[2] N. Pal, P. Arora, P. Kohli, D. Sundararaman, and S. S. Palakurthy,
“How much is my car worth? A methodology for predicting used cars’
prices using random forest,” in Future of Information and
Communication Conference, 2018, pp. 413–422.
[3] A. Camero, J. Toutouh, D. H. Stolfi, and E. Alba, “Evolutionary deep
learning for car park occupancy prediction in smart cities,” in
International Conference on Learning and Intelligent Optimization,
2018, pp. 386–401.
[4] T. Struyf et al., “Signs and symptoms to determine if a patient
presenting in primary care or hospital outpatient settings has COVID-
19 disease,” Cochrane Database Syst. Rev., Jul. 2020, doi:
10.1002/14651858.CD013665.
13. OLD CAR PRICE PREDICTION WITH MACHINE
LEARNING
• Importance:
The importance of the article is that it proposes the development of a
platform that utilizes machine learning technology to predict the price
of used cars. This platform can be valuable for new car buyers who
may not have sufficient knowledge about the market price of their
desired car.
14. • Aim of it:
Using supervised machine learning algorithms such as linear
regression, KNN, Random Forest, XGBoost, and Decision Tree, the
study aims to build a statistical model that can accurately predict the
price of a used car based on consumer data and a given set of
features.
15. • Materials
1. A dataset containing 92386 records: The dataset was used for training the machine
learning models. It contained various attributes such as kilometers traveled, year of
registration, fuel type, car model, fiscal power, car brand, and gear type that were
used to determine the worth of an automobile.
2. K Nearest Neighbors (KNN) Regressor: This algorithm was used as one of the
machine learning models to predict the price of a used car. The algorithm was
implemented with a neighbor range of 1 to 100.
3. Random Forest Regressor: This algorithm was also used as a machine learning model
to predict the price of a used car. It was chosen to account for the large number of
features in the dataset and compare a bagging technique with the Gradient Boost
method.
4. Linear Regression: This algorithm was used as a baseline algorithm to quickly train
and test the model.
5. XGBoost Regressor: This algorithm was used to improve performance compared to
standard Gradient Boosting through regularization, second-order gradients, and added
support for parallel computation.
6. Decision Tree Regressor: This algorithm was used to build regression models whose
structure takes the form of a tree. It was used to break down the dataset into smaller
and smaller subsets based on the information gain value for each individual
feature.
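The splitting step of a decision tree regressor can be illustrated with a single split on one feature. The sketch below is an assumption-laden illustration, not the study's implementation: it scores candidate splits by variance reduction, which is the usual regression counterpart of the information gain mentioned above, and the toy data and function names are hypothetical.

```python
def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(feature, target):
    """Return (threshold, gain) for the split with the largest variance reduction."""
    parent = variance(target)
    best_threshold, best_gain = None, 0.0
    distinct = sorted(set(feature))
    # Candidate thresholds are midpoints between consecutive distinct values
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [t for f, t in zip(feature, target) if f <= threshold]
        right = [t for f, t in zip(feature, target) if f > threshold]
        # Size-weighted variance of the two child subsets
        weighted = (len(left) * variance(left) + len(right) * variance(right)) / len(target)
        gain = parent - weighted
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain

# Hypothetical toy data: kilometres travelled (thousands) and price
km = [20, 30, 40, 120, 130, 140]
price = [9000, 8800, 8600, 4200, 4000, 3800]

threshold, gain = best_split(km, price)
print("Best threshold:", threshold)
print("Variance reduction:", round(gain, 2))
```

A full tree would apply the same search recursively to each subset and across all features until a stopping rule, such as a maximum depth, is reached.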
16. • Results
The results of the study indicate that the Random Forest Regressor
outperformed the other algorithms: the K Nearest Neighbors (KNN)
Regressor, Linear Regression, the XGBoost Regressor, and the Decision
Tree Regressor. Its Root Mean Squared Error (RMSE) on the test dataset
was the lowest, meaning that its predictions were closest to the actual
prices on average. It also had the highest R-squared score, meaning
that it explained the largest share of the variance in price among the
five algorithms.
The residual vs predicted value comparison graphs (Figures 7-11)
showed that, of the five algorithms, the Random Forest Regressor had
the smallest deviation between actual and predicted values. The results
in Table 1 further confirm that the Random Forest Regressor performed
best, with the highest test accuracy of 93.11% and the lowest RMSE
value of 3702.34.
Overall, these findings suggest that the Random Forest Regressor algorithm
is a suitable choice for predicting the worth of an automobile based on
attributes such as kilometers traveled, year of registration, fuel type, car
model, fiscal power, car brand, and gear type.
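The two metrics used in this comparison can be written out directly. The sketch below computes RMSE and R-squared (as 1 minus the ratio of residual to total sum of squares) for two sets of predictions and picks the one with the lower RMSE. The prices, model names, and predictions are hypothetical and are not taken from the study's Table 1.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: typical size of a prediction error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot: share of the variance in actual explained."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical actual prices and predictions from two placeholder models
actual = [5000, 7000, 9000, 11000]
predictions = {
    "model_a": [5200, 6900, 9100, 10800],
    "model_b": [6000, 6000, 10000, 10000],
}

for name, predicted in predictions.items():
    print(name, round(rmse(actual, predicted), 2), round(r_squared(actual, predicted), 4))

# The preferred model is the one with the lowest RMSE
best = min(predictions, key=lambda name: rmse(actual, predictions[name]))
print("Lowest RMSE:", best)
```

For a fixed test set, the model with the lowest RMSE also has the highest R-squared, because both are computed from the same residual sum of squares, which is why the two rankings in the study agree.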
17. • Conclusion
The study used five different supervised machine learning models to
predict used car prices. Of these, the Random Forest model performed
best, with the lowest RMSE and the highest R-squared value. However,
the dataset used was relatively small, and gathering more data and
including additional features could improve the accuracy and stability
of the model.
18. • References:
[1] Ashish Chandak, Prajwal Ganorkar, Shyam Sharma, Ayushi Bagmar, Soumya
Tiwari, Car Price Prediction Using Machine Learning, International Journal of
Computer Sciences and Engineering, Volume 7, Issue 5, May 2019.
[2] Durgesh k. Shrivstava, Lekha Bhambhu, “Data Classification Using Support
Vector Machine”, Journal of Theoretical and Applied Information Technology, Sep.
2009.
[3] Pattabiraman Venkatasubbu, Mukkesh Ganesh, “Used Cars Price Prediction
using Supervised Learning Techniques”, International Journal of Engineering and
Advanced Technology (IJEAT), Vol. 9, Issue 1S3, Dec. 2019.
[4] Vrushali Y Kulkarni, Pradeep K Sinha, “Effective Learning and Classification
using Random Forest Algorithm,” International Journal of Engineering and
Innovative Technology (IJEIT), Vol. 3, Issue. 11, May 2014.