PhD Chapter - Dealing with xls data in MATLAB


Excel data and how to handle it in MATLAB

Here is an outline of what we will cover on this page.

The menu

  1. Getting started with MATLAB
  2. Vectors and variables
  3. Command statements
  4. Import and Export data
    1. Save and load multiple files
    2. Import formatted text data
    3. Import excel format data
  5. Translate formula into Code
  6. Descriptive Statistics
    1. Measures of central tendency
    2. Variance and standard deviation
    3. Sort up and down
    4. Data transformation 
  7. 2D plotting
    1. Drawing the line
    2. Bars
    3. Dots
    4. Multi data
    5. Unsolved images
    6. Histogram
    7. Uncertainty in future money 
    8. Blend picture transparency 
  8. 3D plotting

Linear regression in MATLAB

A nice thing about MATLAB is that you do not need to work through the regression formula by hand: plot the data, and you can get the fitted equation from there.
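
As a hedged illustration, one way to get the equation of a fitted line without writing out the regression formula is MATLAB's built-in polyfit function. The data below are made up for the example, and polyfit is an assumption about the kind of shortcut meant here, not necessarily the exact tool the text refers to.

```matlab
% hypothetical data lying exactly on the line y = 2x + 1
x = (1:5)';
y = 2*x + 1;

% fit a first-degree polynomial (a straight line)
% p(1) is the slope, p(2) is the intercept
p = polyfit(x,y,1);

% evaluate the fitted line at the same x values
yfit = polyval(p,x);

% plot the data and the fitted line together
figure, clf
plot(x,y,'o',x,yfit,'-')
legend({'Data';'Fitted line'})
```

With noisy data the same two calls apply; only the recovered slope and intercept become estimates instead of exact values.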

4. Import and export data 

c. Working with Excel

First, if you want to know what a function does, type help followed by its name in the MATLAB Command Window, for example:

>> help xlsread 

Or, if you want to read the full documentation page, type:

>> doc xlsread 

By default, xlsread returns only the numeric data. To read everything, request three outputs: the numeric data, the text data, and the combined (raw) data from the Excel file.

[num,txt,raw] = xlsread('myExample.xlsx')
num =
     1     2     3
     4     5   NaN
     7     8     9

txt = 
    'First'    'Second'    'Third'
    ''         ''          ''     
    ''         ''          'x'    

raw = 
    'First'    'Second'    'Third'
    [    1]    [     2]    [    3]
    [    4]    [     5]    'x'    
    [    7]    [     8]    [    9]

Got it! 
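
To see how the three outputs line up, here is a minimal sketch that retypes the values shown above instead of reading the file, so it runs without myExample.xlsx. Note that num has no header row, so row r of num corresponds to row r+1 of raw; that one-row offset holds for this example and is not a general rule for every spreadsheet.

```matlab
% outputs copied by hand from the example above
num = [1 2 3; 4 5 NaN; 7 8 9];
raw = {'First','Second','Third'; 1 2 3; 4 5 'x'; 7 8 9};

% locate the cell that could not be read as a number
[r,c] = find(isnan(num));

% look up what the spreadsheet actually contained there
% (+1 skips the header row, which is absent from num)
original = raw{r+1,c};

fprintf('Row %d, column %d of num is NaN; raw holds ''%s''.\n',r,c,original)
```

This is why the raw output is the most convenient one when a sheet mixes numbers and text: nothing is dropped or replaced.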

%%
% COURSE: Master MATLAB through guided problem-solving
% SECTION: Importing and exporting data
% VIDEO: Import and export Excel-format data
% Instructor: mikexcohen.com
%
%%

% list data folder and file
dataFolder = '/Users/dimasmukhlas/Documents/MATLAB/Udemy/';
dataFile = 'sensordata.xlsx';

% import data file using xlsread
% (fullfile joins the folder and the file name into one path)
[numdata,txtdata,rawdata] = xlsread(fullfile(dataFolder,dataFile));

% check sizes and outputs
whos


%% extract key data

% starting line
% strcmpi compares every entry in column 1 with 'Start data' (ignoring
% case) and returns a logical (true/false) vector;
% find then returns the index of the row where the comparison is true
startline = find( strcmpi(rawdata(:,1),'Start data') );

% get list of sensor IDs:
% take rows startline+1 through end-1 of column 2 of rawdata;
% cell2mat converts that cell array of numbers into a numeric vector
sensorID = cell2mat(rawdata(startline+1:end-1,2));

% get list of time points
timepoints = cell2mat(rawdata(startline+1:end-1,4));

% now get all of the data
datatemp = cell2mat(rawdata(startline+1:end-1,end));

% check what the data look like in a plot
% (run these one at a time: each call replaces the previous plot)
plot(sensorID,'o');
plot(timepoints,'o');
plot(datatemp,'o');

% a list of unique sensor numbers would also be useful
uchans = unique(sensorID);
utimes = unique(timepoints);

% initialize data matrix
datamat = nan( length(uchans),length(utimes) );

%% populate and plot

% populate one line at a time
for linei=1:length(sensorID)

    % line-specific channel and time point
    datamat(sensorID(linei),timepoints(linei)) = datatemp(linei);
end

% plot
figure(1), clf
plot(datamat','s-','markerfacecolor','w')


%% bonus: identify missing data

% find where datamat is nan
% isnan returns true wherever an entry is NaN
missingdata = find(isnan(datamat));


% loop over all missing time points
for i=1:length(missingdata)

    % convert index to subscript to find channel/time point
    [missChan,missTime] = ind2sub(size(datamat),missingdata(i));

    % print message
    warning([ 'Channel ' num2str(missChan) ' timepoint ' num2str(missTime) ' has a missing value!' ])
end

%%
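
The extraction steps above can be tried without sensordata.xlsx. The sketch below builds a small made-up rawdata cell array and fills the matrix in one step with sub2ind instead of a loop. The layout of the cell array (a 'Start data' row, IDs in column 2, time points in column 4, values in the last column, one trailing footer row) is an assumption that mirrors the indexing used above, not the real contents of the file.

```matlab
% hypothetical stand-in for the rawdata output of xlsread
rawdata = { ...
    'Experiment', NaN, NaN,    NaN, NaN ; ...
    'Start data', NaN, NaN,    NaN, NaN ; ...
    'sensor',     1,   'time', 1,   10.5; ...
    'sensor',     1,   'time', 2,   11.0; ...
    'sensor',     2,   'time', 1,   20.5; ...
    'End data',   NaN, NaN,    NaN, NaN };

% row that marks the start of the data block
startline = find( strcmpi(rawdata(:,1),'Start data') );

% rows between the marker and the footer row
rows = startline+1 : size(rawdata,1)-1;

% pull out the three columns as numeric vectors
sensorID   = cell2mat(rawdata(rows,2));
timepoints = cell2mat(rawdata(rows,4));
datatemp   = cell2mat(rawdata(rows,end));

% sensors-by-time matrix, NaN where nothing was recorded
datamat = nan( max(sensorID),max(timepoints) );

% sub2ind turns (row,column) pairs into linear indices,
% so all values are assigned in a single statement
datamat( sub2ind(size(datamat),sensorID,timepoints) ) = datatemp;

% find with two outputs returns the subscripts of the missing entries directly
[missChan,missTime] = find(isnan(datamat));
```

Like the loop version, this assumes sensor IDs and time points are positive integers that can be used directly as matrix indices.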

6. Descriptive Statistics

Descriptive statistics give us the full picture of what a data set looks like: its mean, mode, median, and so on.

a. Compute measures of central tendency

It is very important to see what the data look like before moving on to any statistical or econometric model.

%%
% COURSE: Master MATLAB through guided problem-solving
% SECTION: Descriptive statistics
% VIDEO: Compute measures of central tendency
% Instructor: mikexcohen.com
%
%%

% dataset to work with
% round rounds to the nearest integer
% exp is the exponential function, e.g. exp(1) is about 2.718
% randn draws normally distributed random numbers
% (101,1) means 101 rows in 1 column


data = round( exp(2+randn(101,1)/2) );

% always important to look at data!
% what do the data look like?

figure(1), clf
histogram(data,20)

%% compute the mean

% algorithm here
% numel returns the number of elements
n = numel(data);
themean = sum(data)/n;

% compare with MATLAB's mean function
themean2 = mean(data);

%% compute the median

% first look at the data before sorting
plot(data);
% sort the data points (ascending by default)
datasort = sort(data);
plot(datasort);
% if you want descending order
datasortdescend = sort(data,"descend");
plot(datasortdescend);

% find the middle value
% ceil rounds up to the next integer (see also fix and floor);
% this picks the middle point because n is odd (101); for an even n,
% the median is the average of the two middle values
themedian = datasort(ceil(n/2));

% compare with built-in function
themedian2 = median(data);

%% compute the mode

% find the unique data values
uniquevals = unique(data);

% loop through values and count the number of numbers with each value
% size returns the dimensions of uniquevals (for example 21x1; the exact
% number changes from run to run because the data are random)
% zeros creates an array of zeros of that size
numnums = zeros(size(uniquevals));

% the for loop here
% ui starts at 1
% length returns the largest array dimension, here the number of unique values
for ui=1:length(numnums)
    % count how many data points equal the ui-th unique value
    numnums(ui) = sum(data==uniquevals(ui));
end

% find the maximum count
[~,maxidx] = max(numnums);

% the mode is that value
themode = uniquevals(maxidx);

% compare with MATLAB function
themode2 = mode(data);

%% bonus

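% redraw the histogram first, because the sort plots above replaced it
figure(1), clf
histogram(data,20)
hold on

% vertical lines at the mean, median and mode
plot([1 1]*themean,get(gca,'ylim'),'r--','LineWidth',5)
plot([1 1]*themedian,get(gca,'ylim'),'b--','LineWidth',5)
plot([1 1]*themode,get(gca,'ylim'),'k--','LineWidth',5)
legend({'Data';'mean';'median';'mode'})
xlabel('Value'), ylabel('Count')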

%%
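
As a quick check of the three measures, here is a small sketch on a made-up vector whose answers are easy to verify by hand. It uses an even number of points to show the general median rule (average the two middle values), which the ceil(n/2) shortcut above does not cover, and it counts the values for the mode with accumarray instead of a loop.

```matlab
% hypothetical data, small enough to check by hand
data = [7; 3; 10; 3; 2; 5];
n = numel(data);

% mean: sum divided by count
themean = sum(data)/n;

% median: middle value for odd n, average of the two middle values for even n
datasort = sort(data);
if mod(n,2)==1
    themedian = datasort((n+1)/2);
else
    themedian = ( datasort(n/2) + datasort(n/2+1) )/2;
end

% mode: most frequent value
% (the third output of unique maps each data point to its unique value,
% and accumarray adds up a 1 for every occurrence)
[uniquevals,~,idx] = unique(data);
counts = accumarray(idx,1);
[~,maxidx] = max(counts);
themode = uniquevals(maxidx);

% show each hand-computed result next to the built-in function
disp([themean mean(data); themedian median(data); themode mode(data)])
```

For this vector the mean is 5, the median is 4, and the mode is 3, and each matches the corresponding built-in function.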

b. Compute the variance and Standard deviation
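
A minimal sketch of this step, following the same pattern as the central tendency code: compute by hand, then compare with the built-in function. The data vector is made up. The formulas are the standard sample variance, which divides by n-1, and the standard deviation as its square root.

```matlab
% hypothetical data with mean 5
data = [2; 4; 4; 4; 5; 5; 7; 9];
n = numel(data);

% mean, needed for the deviations
themean = sum(data)/n;

% sample variance: average squared deviation, dividing by n-1
thevar = sum( (data-themean).^2 ) / (n-1);

% standard deviation: square root of the variance
thestd = sqrt(thevar);

% compare with MATLAB's functions
thevar2 = var(data);
thestd2 = std(data);

% passing 1 as second input divides by n instead (population variance)
popvar = var(data,1);
```

The squared deviations sum to 32, so the sample variance is 32/7 and the population variance is 4.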

c. Unsolved: Sort data up and down

d. Data transformation (log, sqrt, rank)
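
A short sketch of the three transformations named in the heading, on made-up data. log10 is used so the results are easy to read (log would give the natural logarithm). The rank is computed with sort, which is an assumption about how to do it without extra toolboxes; it gives tied values different ranks, whereas tiedrank from the Statistics and Machine Learning Toolbox averages them.

```matlab
% hypothetical data spanning several orders of magnitude
data = [100; 1; 1000; 10];

% log transform: compresses large values
datalog = log10(data);

% square-root transform: a milder compression
datasqrt = sqrt(data);

% rank transform: replace each value by its position in the sorted order
% sortidx(k) is the index of the k-th smallest value,
% so that data point receives rank k
[~,sortidx] = sort(data);
datarank = zeros(size(data));
datarank(sortidx) = 1:numel(data);

% show the original data next to the three transformed versions
disp([data datalog datasqrt datarank])
```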

7. 2D plotting

a. Drawing the line

%%
% COURSE: Master MATLAB through guided problem-solving
% SECTION: 2D plotting
% VIDEO: Lines
% Instructor: mikexcohen.com
%
%%

%% very brief intro to plotting lines

% start with a clean figure
figure(1), clf

% note: MATLAB draws down the columns;
% this code produces 4 10-point lines
plot(randn(10,4))
% the x axis runs from 1 to 10 (one point per row), and each of the
% 4 columns is drawn as its own line of random values
% gca returns the current axes; its 'xlim' property gives the x-axis
% limits, used here to draw a horizontal line at y = 0
hold on
plot(get(gca,'xlim'),[0 0],'k','linewidth',3)

%% curve from straight lines

% number of lines (resolution)
n = 100;

% let's give this disco colors
% clear figure and hold on
figure(2), clf
hold on

% loop over lines
for i=1:n

    % drawing a line takes two coordinate points:
    % x = i is paired with y = 0,
    % and x = n is paired with y = i
    plot([i n],[0 i],'w')
    plot([i n],[i 0], 'm')
    % the next line is the first one with x and y swapped
    plot([0 i],[i n],'g')

end

% give the plot that 1980's video game look
% gcf returns the current figure; set its background color to black
% and hide the x and y axes
set(gcf,'color','k')
axis off

%%
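
Because MATLAB draws one line per column, the first family of lines in the loop above can also be drawn with a single plot call. This is a sketch of that idea, not part of the original course code; X, Y, and h are names introduced here.

```matlab
% number of lines (resolution)
n = 100;

% column i of X and Y holds the two end points of line i:
% it runs from (i,0) to (n,i)
X = [ 1:n ; n*ones(1,n) ];
Y = [ zeros(1,n) ; 1:n ];

% one call draws all n lines; h holds one handle per line
figure(3), clf
h = plot(X,Y,'m');

% same dark look as above
set(gcf,'color','k')
axis off
```

The loop version is easier to read when each line gets a different style; the matrix version is shorter when all lines share one style.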

e. Uncertainty in Future Money

So this is the case
