Friday, 3 January 2025

What is data explosion and what are its implications? Also explain the 5 V's of Big Data.








 





Wednesday, 1 January 2025

Telephone Book Assignment using Hash Table Implementation (Chaining/ Open Addressing).


Prerequisite:

Before programming in C++, we need to decide which IDE to use. Follow these few steps to set up Dev-C++ and run your code:

1. Install Dev-C++

  1. Download Dev-C++:

    • Open your browser and visit the official Embarcadero Dev-C++ website, or use a trusted source such as SourceForge.
    • Download and install the IDE.
  2. Ensure the installer includes the MinGW (GCC) Compiler:

    • Dev-C++ often comes bundled with MinGW, which supports the STL.

2. Configure Dev-C++

  1. Launch the IDE after installation.

  2. Set the Compiler:

    • Go to Tools > Compiler Options.
    • Check the compiler version and ensure it is modern (preferably GCC 9 or higher for full STL and modern C++ standard support).
  3. Set the C++ Standard (Optional but Recommended):

    • Under Compiler Options, add the following flag in the "Add the following commands when calling the compiler" box:
    • -std=c++17

3. Create a New Project

  1. Go to File > New > Project.
  2. Select "Console Application" and set the language to "C++."
  3. Save the project in your desired location.


Let's Start Implementation:

Problem Statement: 

Consider a telephone book database of n clients. Make use of a hash table implementation to quickly look up a client's telephone number. Use two collision handling techniques and compare them by the number of comparisons required to find a set of telephone numbers.

Algorithm:

Step 1: Start
Step 2: Define a Hash Table class with the following components:
        Step 2.1: For chaining, use an array of linked lists to handle collisions.
        Step 2.2: For open addressing, use an array to store keys directly, along with a status array to track filled slots.

Step 3: Hash Function:
        Step 3.1: Use a simple modulo operation to calculate the hash index.

Step 4: Insert a client's phone number:
        Step 4.1: Compute the hash index using the hash function.
        Step 4.2: Handle collisions using the chosen technique.

Step 5: Search for a client's phone number:
        Step 5.1: Compute the hash index and probe the table until the number is found or confirmed absent.

Step 6: Compare the techniques:
        Step 6.1: Measure the number of comparisons for both techniques when looking up a set of numbers.
Step 7: Stop.
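
Before the C++ listings, here is a minimal sketch of Step 6, written in Python only to keep it short. It stores the same clients in a chaining table and in a linear-probing table, then totals the comparisons needed to look every name up again. The class names, the helper total_comparisons and the sample clients are assumptions made for illustration; they are not part of the assignment code.

```python
TABLE_SIZE = 10  # same table size as the C++ listings in this post


def hash_index(key):
    """Step 3: sum of the character codes, modulo the table size."""
    return sum(ord(ch) for ch in key) % TABLE_SIZE


class ChainingTable:
    """Each bucket holds a growable list of (name, phone) pairs."""

    def __init__(self):
        self.buckets = [[] for _ in range(TABLE_SIZE)]

    def insert(self, name, phone):
        self.buckets[hash_index(name)].append((name, phone))

    def search(self, name):
        """Return (phone or None, number of key comparisons)."""
        comparisons = 0
        for stored_name, phone in self.buckets[hash_index(name)]:
            comparisons += 1
            if stored_name == name:
                return phone, comparisons
        return None, comparisons


class LinearProbingTable:
    """Open addressing: one entry per slot, probe forward on a collision."""

    def __init__(self):
        self.slots = [None] * TABLE_SIZE

    def insert(self, name, phone):
        index = hash_index(name)
        for _ in range(TABLE_SIZE):
            if self.slots[index] is None:
                self.slots[index] = (name, phone)
                return True
            index = (index + 1) % TABLE_SIZE
        return False  # every slot is taken

    def search(self, name):
        """Return (phone or None, number of key comparisons)."""
        index = hash_index(name)
        comparisons = 0
        for _ in range(TABLE_SIZE):
            if self.slots[index] is None:
                break  # an empty slot ends the probe sequence
            comparisons += 1
            if self.slots[index][0] == name:
                return self.slots[index][1], comparisons
            index = (index + 1) % TABLE_SIZE
        return None, comparisons


def total_comparisons(table, names):
    """Total comparisons needed to look up every name in `names`."""
    return sum(table.search(name)[1] for name in names)


# Hypothetical sample data: "Sonali" and "Nitin" both hash to index 4.
clients = [("Sonali", "1111"), ("Nitin", "2222"), ("Amit", "3333"), ("Ravi", "4444")]
chaining, probing = ChainingTable(), LinearProbingTable()
for client_name, client_phone in clients:
    chaining.insert(client_name, client_phone)
    probing.insert(client_name, client_phone)

lookup_names = [client_name for client_name, _ in clients]
print("Chaining comparisons:", total_comparisons(chaining, lookup_names))       # 5
print("Linear probing comparisons:", total_comparisons(probing, lookup_names))  # 6
```

With this sample data, linear probing needs one comparison more than chaining: "Nitin" is pushed from slot 4 into slot 5, which in turn pushes "Amit" from slot 5 into slot 6.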


Code 1:

1. Hash Table with Chaining:

#include <iostream>
#include <cstring>  // memset
#include <string>   // std::string
using namespace std;

#define TABLE_SIZE 10
#define MAX_CHAIN_SIZE 5 // Maximum number of entries in a chain

class HashTable {
private:
    string names[TABLE_SIZE][MAX_CHAIN_SIZE];  // 2D array for names
    string phones[TABLE_SIZE][MAX_CHAIN_SIZE]; // 2D array for phone numbers
    int chainSize[TABLE_SIZE];               // To track the number of entries in each bucket

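    // Hash function: sum of the character codes, modulo the table size
    int hashFunction(const string& key) {
        int hash = 0;
        for (char c : key) {
            hash += static_cast<unsigned char>(c); // avoid negative values for non-ASCII characters
        }
        return hash % TABLE_SIZE;
    }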

public:
    // Constructor
    HashTable() {
        memset(chainSize, 0, sizeof(chainSize));
    }

    // Insert function
    void insert(const string& name, const string& phone) {
        int index = hashFunction(name);
        if (chainSize[index] < MAX_CHAIN_SIZE) {
            names[index][chainSize[index]] = name;
            phones[index][chainSize[index]] = phone;
            chainSize[index]++;
        } else {
            cout << "Error: Bucket overflow! Cannot insert " << name << ".\n";
        }
    }

    // Search function with a static variable to count comparisons
    string search(const string& name) {
        static int totalComparisons = 0; // Static variable to keep track of total comparisons
        int comparisons = 0;             // Local variable for comparisons in this call

        int index = hashFunction(name);
        for (int i = 0; i < chainSize[index]; ++i) {
            comparisons++;
            totalComparisons++; // Increment the static variable
            if (names[index][i] == name) {
                //cout << "Comparisons for this search: " << comparisons << endl;
                cout << "Total comparisons so far: " << totalComparisons << endl;
                return phones[index][i];
            }
        }

        //cout << "Comparisons for this search: " << comparisons << endl;
        cout << "Total comparisons so far: " << totalComparisons << endl;
        return "Not Found";
    }

    // Display function
    void display() {
        cout << "Hash Table Contents:\n";
        for (int i = 0; i < TABLE_SIZE; ++i) {
            cout << "Bucket " << i << ": ";
            if (chainSize[i] == 0) {
                cout << "Empty\n";
            } else {
                for (int j = 0; j < chainSize[i]; ++j) {
                    cout << "[" << names[i][j] << ": " << phones[i][j] << "] ";
                }
                cout << "\n";
            }
        }
    }


};

int main() {
    HashTable hashTable;

    int n;
    cout << "Enter the number of clients: ";
    cin >> n;

    // Insert data into the hash table
    for (int i = 0; i < n; ++i) {
        string name, phone;
        cout << "Enter name of client " << i + 1 << ": ";
        cin >> name;
        cout << "Enter phone number of client " << i + 1 << ": ";
        cin >> phone;
        hashTable.insert(name, phone);
    }
    cout << "\n";
    cout << "\n Let's search for a name in the hash table:\n";
    // Search for a key
    string searchName;
    cout << "Enter the name to search for: ";
    cin >> searchName;
    string result = hashTable.search(searchName);
    if (result != "Not Found") {
        cout << "Phone number of " << searchName << ": " << result <<endl;
    } else {
        cout << searchName << " not found in the telephone book.\n";
    }
    cout << "\n";
    cout << "\nHash table elements are as follows...\n";
    hashTable.display();

    return 0;
}

Output:


Explanation of Output:

As the output above shows, some inputs are stored in the same bucket. For example, three inputs are stored in bucket 4. This happens because of the hash function we used: it returns the same index for all of those inputs, so they are stored in the same bucket.

Let's see how the hash function works:

Sonali: S (83) + o (111) + n (110) + a (97) + l (108) + i (105)    Total = 614

Taking the remainder of this sum of ASCII values gives 614 % 10 = 4.

Nitin: N (78) + i (105) + t (116) + i (105) + n (110)    Total = 514

Taking the remainder of this sum of ASCII values gives 514 % 10 = 4.
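
The arithmetic above can be checked with a few lines of Python (used here only for brevity). This is a small sketch that mirrors the C++ hashFunction; the name ascii_sum_hash is an assumption for illustration.

```python
def ascii_sum_hash(name, table_size=10):
    """Return (sum of character codes, bucket index) for a name."""
    total = sum(ord(ch) for ch in name)
    return total, total % table_size


for client in ("Sonali", "Nitin"):
    total, bucket = ascii_sum_hash(client)
    print(f"{client}: sum = {total}, bucket = {bucket}")
# Sonali: sum = 614, bucket = 4
# Nitin: sum = 514, bucket = 4
```

Both names land in bucket 4 even though their sums differ by 100, because only the last digit of the sum survives the modulo 10 step.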


Code 2:

2. Hash Table with Open Addressing: a C++ implementation that uses linear probing for collision resolution. It includes insert, search, and display operations.


#include <iostream>
#include <string>  // std::string
using namespace std;

#define TABLE_SIZE 10
#define EMPTY "EMPTY"  // Placeholder for empty slots
#define DELETED "DELETED" // Placeholder for deleted slots

class HashTable {
private:
    string names[TABLE_SIZE];  // Array for names
    string phones[TABLE_SIZE]; // Array for phone numbers
    bool occupied[TABLE_SIZE]; // Tracks occupied slots

    // Hash function
    int hashFunction(const string& key) {
        int hash = 0;
        for (char c : key) {
            hash += c;
        }
        return hash % TABLE_SIZE;
    }

public:
    // Constructor
    HashTable() {
        for (int i = 0; i < TABLE_SIZE; ++i) {
            names[i] = EMPTY;
            phones[i] = EMPTY;
            occupied[i] = false;
        }
    }

    // Insert function
    void insert(const string& name, const string& phone) {
        int index = hashFunction(name);
        int start = index;
        while (names[index] != EMPTY && names[index] != DELETED) {
            index = (index + 1) % TABLE_SIZE;
            if (index == start) { // Table is full
                cout << "Error: Hash table is full. Cannot insert " << name << ".\n";
                return;
            }
        }
        names[index] = name;
        phones[index] = phone;
        occupied[index] = true;
    }

    // Search function
    string search(const string& name) {
        int index = hashFunction(name);
        int start = index;
        int comparisons = 0;
        while (names[index] != EMPTY) {
            comparisons++;
            if (names[index] == name) {
                cout << "\nTotal comparisons taken to search: " << comparisons << endl;
                return phones[index];
            }
            index = (index + 1) % TABLE_SIZE;
            if (index == start) { // Avoid infinite loops
                break;
            }
        }
        cout << "\nTotal comparisons taken to search: " << comparisons << endl;
        return "Not Found";
    }

    // Display function
    void display() {
        cout << "Hash Table Contents:\n";
        for (int i = 0; i < TABLE_SIZE; ++i) {
            if (names[i] == EMPTY || names[i] == DELETED) {
                cout << "Bucket " << i << ": [EMPTY]\n";
            } else {
                cout << "Bucket " << i << ": [" << names[i] << ": " << phones[i] << "]\n";
            }
        }
    }
};

int main() {
    HashTable hashTable;

    int n;
    cout << "Enter the number of clients: ";
    cin >> n;

    // Insert data into the hash table
    for (int i = 0; i < n; ++i) {
        string name, phone;
        cout << "Enter name of client " << i + 1 << ": ";
        cin >> name;
        cout << "Enter phone number of client " << i + 1 << ": ";
        cin >> phone;
        hashTable.insert(name, phone);
    }

    // Display the hash table
    hashTable.display();

    // Search for a key
    string searchName;
    cout << "Enter the name to search for: ";
    cin >> searchName;
    string result = hashTable.search(searchName);
    if (result != "Not Found") {
        cout << "Phone number of " << searchName << ": " << result << endl;
    } else {
        cout << searchName << " not found in the telephone book.\n";
    }

    return 0;
}

Output:



Question 1: When should you use chaining, and when should you use open addressing?

Ans: Use chaining when the table holds a lot of data, that is, when a high load factor is expected. It also suits cases where memory is not a constraint (there is no fixed limit, such as 550 MB, on how much you may use) and where insertions and deletions are frequent.

Use open addressing when insertions and deletions are infrequent, when memory is constrained, and when the dataset is small enough that the load factor can be kept low.
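
One reason frequent deletion is awkward with open addressing is that a freed slot cannot simply be marked empty: an empty slot ends the probe sequence, so entries stored further along would become unreachable. The C++ listing above defines a DELETED placeholder but has no delete operation, so the following Python sketch shows one way such a marker (often called a tombstone) might be used. The class ProbingTable and its remove method are assumptions for illustration, not part of the assignment code.

```python
TABLE_SIZE = 10
EMPTY = object()    # slot has never been used
DELETED = object()  # tombstone: slot was used, then freed


class ProbingTable:
    def __init__(self):
        self.names = [EMPTY] * TABLE_SIZE
        self.phones = [None] * TABLE_SIZE

    def _hash(self, key):
        return sum(ord(ch) for ch in key) % TABLE_SIZE

    def insert(self, name, phone):
        index = self._hash(name)
        for _ in range(TABLE_SIZE):
            # Both never-used slots and tombstones can be reused.
            if self.names[index] is EMPTY or self.names[index] is DELETED:
                self.names[index], self.phones[index] = name, phone
                return True
            index = (index + 1) % TABLE_SIZE
        return False  # table is full

    def _find(self, name):
        """Return the slot index holding `name`, or -1 if it is absent."""
        index = self._hash(name)
        for _ in range(TABLE_SIZE):
            if self.names[index] is EMPTY:
                return -1  # a never-used slot ends the probe sequence
            if self.names[index] == name:  # tombstones never equal a name
                return index
            index = (index + 1) % TABLE_SIZE
        return -1

    def search(self, name):
        index = self._find(name)
        return self.phones[index] if index != -1 else None

    def remove(self, name):
        index = self._find(name)
        if index == -1:
            return False
        self.names[index] = DELETED  # leave a tombstone, not EMPTY
        self.phones[index] = None
        return True


table = ProbingTable()
table.insert("Sonali", "1111")  # hashes to slot 4
table.insert("Nitin", "2222")   # also hashes to 4, probes to slot 5
table.remove("Sonali")
print(table.search("Nitin"))    # still found: the probe passes over the tombstone
```

Tombstones accumulate as entries are deleted and lengthen later probe sequences, which is one way to see why chaining is the more comfortable choice when deletions are frequent.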




Note: For chaining, we have used a fixed-size array for each bucket; you can use a linked list instead.


Basics and need of Data Science and Big Data, Applications of Data Science

 




























Sunday, 1 December 2024

FPP Assignment No 14 and 15

Problem Statement:
Develop a program to create a DataFrame from a NumPy array with custom column names.


import numpy as np
import pandas as pd

# Create a NumPy array
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Define custom column names
columns = ['Column A', 'Column B', 'Column C']

# Create a DataFrame
df = pd.DataFrame(data, columns=columns)

print("DataFrame created from NumPy array:")
print(df)

2. Drawing a Bar Plot and Scatter Plot using Matplotlib


import matplotlib.pyplot as plt

# Data for plots
categories = ['A', 'B', 'C', 'D']
values = [4, 7, 1, 8]
x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

# Bar Plot
plt.figure(figsize=(8, 4))
plt.bar(categories, values, color='skyblue')
plt.title("Bar Plot")
plt.xlabel("Categories")
plt.ylabel("Values")
plt.show()

# Scatter Plot
plt.figure(figsize=(8, 4))
plt.scatter(x, y, color='red', label='Points')
plt.title("Scatter Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
plt.show()

Explanation:

  1. DataFrame Creation:

    • The program uses np.array to create a data matrix.
    • Custom column names are passed to pd.DataFrame to create the DataFrame.
  2. Bar Plot:

    • A bar plot is drawn using plt.bar, with labels for categories and values.
  3. Scatter Plot:

    • A scatter plot is created using plt.scatter, with x and y as input points.


Friday, 29 November 2024

FPP Assignment No 13

 Problem Statement: Create a basic CGI script in Python that allows students to submit their name and grade through a form and displays a personalized message based on their input.




Prerequisites

  1. Ensure you have a web server capable of running CGI scripts (e.g., Apache or Python's built-in HTTP server with CGI enabled).
  2. Save the script in a directory configured for CGI execution (e.g., cgi-bin).
  3. Use Python 3.12 or earlier: the cgi and cgitb modules used below were deprecated in Python 3.11 and removed in Python 3.13.


Save this script as student_form.py in your CGI directory.

import cgi
import cgitb
import html

# Enable debugging
cgitb.enable()


def main():
    # Output HTTP headers (print adds the blank line that ends the headers)
    print("Content-Type: text/html\n")

    # Get form data
    form = cgi.FieldStorage()
    name = form.getvalue("name", "").strip()
    grade = form.getvalue("grade", "").strip()

    # HTML Response
    print("""
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Student Grade Submission</title>
    </head>
    <body>
    """)

    if not name or not grade:
        # No input yet: show the form
        print("""
        <h1>Student Grade Submission</h1>
        <form method="post" action="student_form.py">
            <label for="name">Name:</label><br>
            <input type="text" id="name" name="name" required><br><br>
            <label for="grade">Grade:</label><br>
            <input type="text" id="grade" name="grade" required><br><br>
            <button type="submit">Submit</button>
        </form>
        """)
    else:
        # Generate personalized message
        try:
            grade = float(grade)
            if grade >= 90:
                message = "Excellent work!"
            elif grade >= 75:
                message = "Good job, keep improving!"
            elif grade >= 50:
                message = "You passed, but there's room for improvement."
            else:
                message = "Unfortunately, you did not pass. Better luck next time!"
        except ValueError:
            message = "Invalid grade input. Please enter a numeric grade."

        # Escape user input before echoing it back, so it cannot inject HTML
        print(f"""
        <h1>Hello, {html.escape(name)}!</h1>
        <p>Your grade is: {html.escape(str(grade))}</p>
        <p>{message}</p>
        <a href="student_form.py">Submit another grade</a>
        """)

    print("""
    </body>
    </html>
    """)


if __name__ == "__main__":
    main()


FPP Assignment No 12 Write Python code that creates a simple graphical user interface (GUI) application.

Problem Statement: Write Python code that creates a simple graphical user interface (GUI) application. 


Program Implementation and Explanation:


import tkinter as tk
from tkinter import messagebox


def display_greeting():
    """Displays a greeting message."""
    name = name_entry.get()
    if name.strip():
        messagebox.showinfo("Greeting", f"Hello, {name}!")
    else:
        messagebox.showwarning("Input Error", "Please enter your name.")


# Create the main window
root = tk.Tk()
root.title("Greeting App")
root.geometry("300x200")  # Set window size

# Create and place widgets
greeting_label = tk.Label(root, text="Enter your name:", font=("Arial", 12))
greeting_label.pack(pady=10)

name_entry = tk.Entry(root, width=30, font=("Arial", 12))
name_entry.pack(pady=5)

greet_button = tk.Button(root, text="Greet Me!", font=("Arial", 12), command=display_greeting)
greet_button.pack(pady=10)

quit_button = tk.Button(root, text="Quit", font=("Arial", 12), command=root.quit)
quit_button.pack(pady=10)

# Start the application
root.mainloop()


------------------------------------------

Explanation:

1. tkinter (library): It is part of the Python standard library.

2. root (main window): It represents the main GUI window that appears on the screen when we run the code.
geometry is the method that sets the window size.
title is the method that sets the window title.

3. Widgets: These are the graphical components through which the user interacts with the application.
Label: Use this widget to display text on the main window.
Entry: Use this widget to accept input from the user.
Button: Use this widget when you want a button that performs some operation.

4. Event handling: Clicking a button triggers an event, which is handled by the following:
command: Through this parameter we bind the function that should run when the button is clicked.
messagebox: It is the module used to display pop-up messages.

5. root.mainloop(): This is the event loop; it runs continuously until the user closes the window.

FPP Assignment No 11 Backing up a given folder.

Problem Statement: Develop a program to back up a given folder (a folder in the current working directory) into a ZIP file by using relevant modules and suitable methods.


Program Implementation:


import os
import zipfile
from datetime import datetime


def backup_to_zip(folder):
    # Ensure the folder path is absolute.
    # An absolute path is a full path that specifies the location of a file or
    # directory from the root directory ('/'). It points directly to the file or
    # directory, regardless of the current working directory.
    folder = os.path.abspath(folder)

    # Generate a unique filename for the ZIP file
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    zip_filename = f"{os.path.basename(folder)}_backup_{timestamp}.zip"

    print(f"Creating backup ZIP file: {zip_filename}")

    # Create the ZIP file
    with zipfile.ZipFile(zip_filename, 'w', zipfile.ZIP_DEFLATED) as backup_zip:
        # Walk through the folder and add files to the ZIP
        for foldername, subfolders, filenames in os.walk(folder):
            print(f"Adding folder: {foldername}")

            # Add the current folder to the ZIP under a relative path.
            # A relative path specifies the location of a file or directory in
            # relation to another directory (often the current working
            # directory) and does not start with a slash ('/'). Here the path is
            # taken relative to the folder being backed up.
            if foldername != folder:  # the top-level folder would be stored as "."
                backup_zip.write(foldername, os.path.relpath(foldername, folder))

            # Add all files in this folder to the ZIP
            for filename in filenames:
                file_path = os.path.join(foldername, filename)
                arcname = os.path.relpath(file_path, folder)
                print(f"Adding file: {file_path}")
                backup_zip.write(file_path, arcname)

    print(f"Backup completed successfully! Saved as {zip_filename}")


# Example Usage
if __name__ == "__main__":
    folder_to_backup = input("Enter the folder name to back up: ")
    if os.path.exists(folder_to_backup) and os.path.isdir(folder_to_backup):
        backup_to_zip(folder_to_backup)
    else:
        print(f"The folder '{folder_to_backup}' does not exist or is not a directory.")

 

------------------------------------------------------------------------------------------------

Explanation:

Q. What is zipfile.ZipFile()?

zipfile.ZipFile is a class provided by Python's zipfile module. It allows you to create, read, write, and extract ZIP files. The zipfile module is part of Python's standard library, so you don't need to install additional packages to use it.

With the ZipFile class, you can open an existing ZIP file or create a new one.

Here is the syntax for creating a ZipFile object:


zipfile.ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True)


Parameters

  • file: The name of the ZIP file (str or file-like object). If you're creating a new ZIP file, provide the name here.
  • mode:
    • 'r': Read (default).
    • 'w': Write, creating a new ZIP file (overwrites if it exists).
    • 'x': Write, creating a new ZIP file (raises an error if it exists).
    • 'a': Append to an existing ZIP file or create a new one.
  • compression: The compression type. Possible values:
    • zipfile.ZIP_STORED: No compression (default).
    • zipfile.ZIP_DEFLATED: Compressed (requires zlib).
    • zipfile.ZIP_BZIP2: BZIP2 compression (requires bz2).
    • zipfile.ZIP_LZMA: LZMA compression (requires lzma).
  • allowZip64: Whether to enable ZIP64 extensions for large files (enabled by default).
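
To make the modes listed above concrete, here is a small self-contained sketch that exercises 'w', 'a', 'r' and 'x' on a throwaway archive inside a temporary directory. The archive name demo_backup.zip and the member names under notes/ are made up for the example.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as workdir:
    archive_path = os.path.join(workdir, "demo_backup.zip")  # hypothetical name

    # 'w': create a new archive (would overwrite an existing one)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("notes/readme.txt", "first file")

    # 'a': append a member to the existing archive
    with zipfile.ZipFile(archive_path, "a", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("notes/todo.txt", "second file")

    # 'r': read the archive back
    with zipfile.ZipFile(archive_path, "r") as archive:
        member_names = archive.namelist()
        first_bad_member = archive.testzip()  # None when every member passes its CRC check
        readme_text = archive.read("notes/readme.txt").decode("utf-8")

    # 'x': exclusive creation fails because the archive already exists
    try:
        zipfile.ZipFile(archive_path, "x")
        overwrite_blocked = False
    except FileExistsError:
        overwrite_blocked = True

print(member_names)       # ['notes/readme.txt', 'notes/todo.txt']
print(first_bad_member)   # None
print(readme_text)        # first file
print(overwrite_blocked)  # True
```

A check along the lines of namelist() and testzip() could also be run on the file produced by backup_to_zip to confirm that the backup is readable.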