  • Data mining project

    A STEP TOWARDS MOVIE REVIEW ANALYSIS FOR MOVIE RATING PREDICTION

    Welcome to our web page

    Project Description

    Text mining, also known as text data mining or knowledge discovery from textual corpora, is the process of extracting interesting, non-trivial patterns or knowledge from text documents. Text analysis is gaining popularity as a way to understand sentiment towards products and entertainment and to support decision making. For this data mining project, we explore the Amazon movie review text data set together with the movie ratings given by different users. Sentiment analysis of movie reviews can potentially reveal important patterns in movie ratings.

  • About Us

    Group: Fantastic 4
    Who are we?


    Name: Paromita Nitu

    Graduate Teaching Assistant
    MSCS Department, Marquette University

    Email: paromita.nitu@marquette.edu



    Name: Zachary Boyd

    Graduate student
    Department of Biophysics, Medical College of Wisconsin

    Email: zachary.r.boyd@marquette.edu
    zboyd@mcw.edu





    Name: Nihel Charfi

    Graduate student
    MSCS Department, Marquette University

    Email: nihel.charfi@marquette.edu





    Name: Matthew Shafis

    Graduate student
    MSCS Department, Marquette University

    Email: matthew.shafis@marquette.edu


  • Goals

    • Movie review classification: classify each review as positive, negative, or neutral.
    • Movie rating prediction based on sentiment analysis of the reviews.
    • Document clustering of movie reviews.
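
    As a minimal sketch of the first goal, the three sentiment classes could be derived from the star rating that accompanies each review. The function name `sentiment_from_score` and the thresholds (4 and above is positive, 2 and below is negative) are assumptions made for illustration, not choices stated by the project.

```python
def sentiment_from_score(score: float) -> str:
    """Map a star rating to a coarse sentiment label.

    The cut-offs are assumed for illustration: >= 4 positive,
    <= 2 negative, anything in between neutral.
    """
    if score >= 4:
        return "positive"
    if score <= 2:
        return "negative"
    return "neutral"


# Example ratings (made up) converted to labels.
labels = [sentiment_from_score(s) for s in (5.0, 3.0, 1.0)]
print(labels)
```

    Labels produced this way could serve as training targets for a text classifier, so that the review text alone is used to predict the sentiment class.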

    Project Goals: Amazon movie reviews datasets


  • Our project

    Here you can check our project updates.

      Data Description:

      -The dataset consists of movie reviews from Amazon.

      -The Amazon Movie Reviews dataset consists of 7,911,684 reviews that Amazon users left between Aug 1997 and Oct 2012 about 253,059 products.

      -The data format has the following columns:

      Product Id: A unique identifier generated by Amazon and assigned to each movie.

      User Id: The ID of the user who wrote the review.

      Profile Name: The profile name of the user who wrote the review.

      Score: The rating the user gave the movie.

      Time: The time at which the review was written.

      Summary: A short summary of the review.

      Text: The comments and review written by the user about the movie.
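
      The columns above can be sketched as a small record type with a parser. This is a sketch under stated assumptions: the raw file layout (blocks of `key: value` lines separated by blank lines), the raw key names in `FIELD_MAP`, and the Unix-timestamp encoding of the time field are all assumed rather than confirmed by the description above, and the sample records are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, Iterator

# Assumed raw field names; adjust them to match the actual file.
FIELD_MAP = {
    "product/productId": "product_id",
    "review/userId": "user_id",
    "review/profileName": "profile_name",
    "review/score": "score",
    "review/time": "time",
    "review/summary": "summary",
    "review/text": "text",
}


@dataclass
class Review:
    product_id: str
    user_id: str
    profile_name: str
    score: float
    time: datetime
    summary: str
    text: str


def _build(record: Dict[str, str]) -> Review:
    """Convert raw string fields into a typed Review."""
    return Review(
        product_id=record.get("product_id", ""),
        user_id=record.get("user_id", ""),
        profile_name=record.get("profile_name", ""),
        score=float(record.get("score", "nan")),
        # Assumption: the time field is a Unix timestamp in seconds.
        time=datetime.fromtimestamp(int(record.get("time", "0")), tz=timezone.utc),
        summary=record.get("summary", ""),
        text=record.get("text", ""),
    )


def parse_reviews(lines: Iterable[str]) -> Iterator[Review]:
    """Yield one Review per blank-line-separated block of 'key: value' lines."""
    record: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            if record:
                yield _build(record)
                record = {}
            continue
        # Split on the first ': ' only, so colons inside the text survive.
        key, sep, value = line.partition(": ")
        if sep and key in FIELD_MAP:
            record[FIELD_MAP[key]] = value
    if record:
        yield _build(record)


# Made-up sample records, for illustration only.
SAMPLE = """\
product/productId: P0001
review/userId: U0001
review/profileName: Example Reviewer
review/score: 5.0
review/time: 1000000000
review/summary: Loved it
review/text: A wonderful film: moving and well acted.

product/productId: P0002
review/userId: U0002
review/profileName: Another Reviewer
review/score: 2.0
review/time: 1100000000
review/summary: Not for me
review/text: The plot dragged and the ending fell flat.
"""

reviews = list(parse_reviews(SAMPLE.splitlines()))
print(len(reviews), reviews[0].score, reviews[0].time.year)
```

      Because the parser is a generator, it can stream a file with millions of reviews without loading everything into memory at once.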

      Tools:

      -All coding, data manipulation, and processing will be performed using Python v3.6. A number of specific tools and packages have been identified as potentially useful for the goals of the project.

      Packages:

      -The Natural Language Toolkit (NLTK): The toolkit provides several useful tools for working with text in Python. Specifically, it contains an implementation of a naïve Bayes classifier that can be used for sentiment analysis of the data set.
      Additionally, its word-frequency tools can supply the counts needed to generate word clouds for easy data visualization when working with text.
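
      To illustrate the idea behind a naïve Bayes sentiment classifier, here is a from-scratch sketch using only the standard library. It is not NLTK's implementation (NLTK ships a ready-made `NaiveBayesClassifier`); the class name `TinyNaiveBayes`, the whitespace tokenization, the add-one smoothing, and the four training sentences are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Word-count naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()            # documents per label
        self.word_counts = defaultdict(Counter)  # word counts per label
        self.vocabulary = set()

    def train(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus the sum of log likelihoods of each known word.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                if word not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training sentences, for illustration only.
training = [
    ("a wonderful and moving film", "positive"),
    ("great acting and a great story", "positive"),
    ("boring plot and terrible acting", "negative"),
    ("a dull and terrible film", "negative"),
]

model = TinyNaiveBayes()
model.train(training)
print(model.classify("wonderful story"))
print(model.classify("terrible and dull"))
```

      Working in log space avoids numeric underflow when many word probabilities are multiplied, which matters for long reviews.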

      -The scikit-learn machine learning library

      -The pandas and NumPy libraries will likely be used throughout the project for general data handling.

  • Contact us

    We will respond within 48 hours.

    Marquette University | Department of Mathematics, Statistics and Computer Science.

    1313 W Wisconsin Ave, Milwaukee, WI 53233

Copyright © 2018 Marquette University