Antonio Gulli's coding playground: random commentary about Machine Learning, Big Data, Spark, Deep Learning, C++, STL, Boost, Perl, Python, Algorithms, Problem Solving and Web Search.

Demystifying deep learning: explaining a LeNet-like code (2016-05-08)

In a previous post we introduced the LeNet-like convnet. Now let's discuss the code.
Keras implements both convolutional and max-pooling modules, together with l1 and l2 regularizers and several optimizers such as Stochastic Gradient Descent (SGD), Adam, and RMSprop. In the following code, print_Graph is a utility function used to plot the results of different experiments as we change the hyper-parameters. Our convnet is defined by convNet_LeNet, a function accepting multiple input parameters.<br />
<br />
Parameters<br />
<br />
<ul>
<li>NORMALIZE: Whether or not the MNIST pixel values should be divided by 255, the maximum value of a pixel. </li>
<li>BATCH_SIZE: The mini-batch size used for the model. The idea is that the first BATCH_SIZE examples are used to train the network; then the weights are updated and the next BATCH_SIZE examples are considered. </li>
<li>NUM_EPOCHS: The number of training epochs used during the experiments </li>
<li>NUM_FILTERS: The number of convolutional filters / feature maps applied </li>
<li>NUM_POOL: The side length of the max-pooling window </li>
<li>NUM_CONV: The side length of the convolution kernel </li>
<li>DROPOUT_RATE: The dropout rate. It acts as a form of regularization by randomly dropping connections in the network </li>
<li>NUM_HIDDEN: The number of hidden neurons in the dense network applied after the convolutional and max-pooling operations </li>
<li>VALIDATION_SPLIT: The percentage of training data used as validation data </li>
<li>OPTIMIZER: One of SGD, Adam, or RMSprop </li>
<li>REGULARIZER: Either L1, L2, or L1+L2 (ElasticNet) </li>
</ul>
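The mini-batch mechanics behind BATCH_SIZE can be sketched in a few lines of plain Python (the `update_weights` callback here is a hypothetical stand-in for the gradient step Keras performs internally):

```python
import numpy as np

def train_one_epoch(X, y, batch_size, update_weights):
    """Walk the training set in mini-batches; each batch
    triggers exactly one weight update."""
    n_updates = 0
    for start in range(0, len(X), batch_size):
        X_batch = X[start:start + batch_size]
        y_batch = y[start:start + batch_size]
        update_weights(X_batch, y_batch)  # one gradient step per mini-batch
        n_updates += 1
    return n_updates

# 60,000 MNIST examples with BATCH_SIZE=128 give ceil(60000/128) = 469 updates
X, y = np.zeros((60000, 28 * 28)), np.zeros(60000)
print(train_one_epoch(X, y, 128, lambda xb, yb: None))  # → 469
```

Smaller batches mean more (noisier but more frequent) gradient steps per epoch; larger batches mean fewer, smoother updates.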
<br />
At the beginning, the MNIST images are loaded. The training set contains 60,000 examples and the test set 10,000 examples. Then the data is converted into float32, the format expected for GPU computation. Optionally, the data is normalized. The true labels for training and test are converted from the original [0-9] range into a One Hot Encoding (OHE) representation, a prerequisite for the classification that follows.<br />
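The OHE conversion itself is tiny; a minimal NumPy sketch of what `np_utils.to_categorical` does:

```python
import numpy as np

def to_one_hot(labels, num_classes=10):
    """Map integer labels in [0-9] to rows with a single 1."""
    labels = np.asarray(labels)
    ohe = np.zeros((len(labels), num_classes), dtype="float32")
    ohe[np.arange(len(labels)), labels] = 1.0
    return ohe

Y = to_one_hot([3, 0])  # digit 3 -> 1 in column 3, digit 0 -> 1 in column 0
```

Each row is all zeros except for a single 1 in the column of the true digit, which is exactly the target shape that categorical_crossentropy expects.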
<br />
Then we implement the proper LeNet architecture. An initial convolution of size NUM_CONV x NUM_CONV is applied. It produces NUM_FILTERS outputs and uses a rectified linear activation function. The output of this layer is passed through a similar convolution layer and a subsequent max-pooling layer of size NUM_POOL x NUM_POOL. To avoid overfitting, an additional module drops connections with rate DROPOUT_RATE. This initial block is then repeated with a second identical block. Notice that Keras automatically computes the dimensions of the data as it moves through and is transformed across the different blocks.<br />
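Although Keras tracks the shapes for us, they are easy to check by hand: each valid (unpadded) NUM_CONV x NUM_CONV convolution shrinks a side by NUM_CONV - 1, and each NUM_POOL x NUM_POOL max pool divides it by NUM_POOL. A quick sanity check with the defaults used here (28x28 input, NUM_CONV=3, NUM_POOL=2):

```python
def block_output_side(side, num_conv=3, num_pool=2):
    """Side length after conv -> conv -> maxpool with 'valid' borders."""
    side -= num_conv - 1     # first convolution
    side -= num_conv - 1     # second convolution
    return side // num_pool  # max pooling

side = block_output_side(28)    # first block:  28 -> 26 -> 24 -> 12
side = block_output_side(side)  # second block: 12 -> 10 -> 8  -> 4
print(side)  # → 4, so the Flatten layer sees NUM_FILTERS * 4 * 4 values
```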
<br />
After the convnet layers proper, two dense layers are introduced. The first is a layer with NUM_HIDDEN neurons, and the second has NUM_CLASSES outputs, turned into a probability distribution by a softmax activation.
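The softmax activation is what turns the NUM_CLASSES raw scores into a probability distribution over the ten digits; a minimal NumPy version for intuition:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
# probs is non-negative, sums to 1, and the largest score keeps the largest mass
```

The predicted digit is simply the index with the highest probability.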
You might wonder why we adopt this particular architecture rather than a simpler or more complex one. Indeed, the way convolution and max-pooling operations are composed depends a lot on the specific domain, and there are not necessarily theoretical motivations explaining the optimal composition. The suggestion is to start with something very simple, check the achieved performance, and then iterate by adding more layers as long as gains are observed and the cost of execution does not increase too much.<br />
<br />
I know it seems a kind of magic, but the important thing to understand is that even a relatively simple network like this one outperforms traditional machine learning techniques.
The model is then compiled using categorical_crossentropy as the loss function and accuracy as the metric. In addition, an early-stopping criterion is adopted.

Demystifying deep learning series: hands-on experimental sessions with convnets (2016-05-07)

This is the first of a hands-on series where I'll explain deep learning step by step, with a lot of experimental results. Let's start from a classical but hard enough problem: recognizing handwritten numbers.<br />
<br />
How many times have you wondered "is that a 4 or a 9?" when your best mate wrote a number on a piece of paper? Well, if that's hard for humans, how could it possibly be simpler for a computer to learn? Welcome to the kingdom of deep learning, where computers can be taught certain tasks with superhuman capacity. And when I say "taught" I mean it. Here, we don't code algorithms for solving problems. No, here we code algorithms for learning how to solve a problem. Then we take a bunch of examples and the computer learns from them. Kind of cool, no?<br />
<br />
So let's start.<br />
<br />
First, we need a dataset with handwritten characters, and luckily we have one handy. That's MNIST (<a href="http://yann.lecun.com/exdb/mnist/">http://yann.lecun.com/exdb/mnist/</a>), produced by <a href="http://yann.lecun.com/">Yann LeCun</a>, the guru of deep learning, currently at Facebook. He invented something known as ConvNets, which broke previous learning records in many different application domains. I think he will get the Turing Award one day. Convnets are simple and effective, as we will see in follow-up postings.<br />
<br />
Second, we need a high-level library for coding deep learning in a simple and effective way. Here we are super-lucky because in the last year there has been a Cambrian explosion of deep learning libraries, with all the big players contributing, from Google to Facebook to Microsoft to the academic world. After testing many (Theano, Google's TensorFlow, Lasagne, Blocks, Neon) I decided to go with <a href="http://keras.io/">Keras</a> because it is clean and minimalist. Plus, it runs on top of Theano and TensorFlow, which are the state of the art today, and you can switch backends transparently. Keras supports both CPU and GPU computation.<br />
<br />
Third, let's go straight to some code I wrote, which reaches an accuracy above 98%:<br />
<pre class="brush: python">
import numpy as np
import matplotlib.pyplot as plt
import time
np.random.seed(1111) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras.regularizers import l2, activity_l2
from keras.utils.visualize_util import plot
from keras.optimizers import SGD, Adam, RMSprop
from keras.callbacks import EarlyStopping
import inspect
#
# save the graph produced by the experiment
#
def print_Graph(
        # Training log
        fitlog,
        # elapsed time
        elapsed,
        # input parameters for the experiment
        args,
        # input values for the experiment
        values):
    experiment_label = "\n".join(['%s=%s' % (i, values[i]) for i in args])
    experiment_file = experiment_label + "-Time= %02d" % elapsed + "sec"
    experiment_file = experiment_file.replace("\n", "-") + '.png'
    fig = plt.figure(figsize=(6, 3))
    plt.plot(fitlog.history["val_acc"])
    plt.title('val_accuracy')
    plt.ylabel('val_accuracy')
    plt.xlabel('iteration')
    fig.text(.7, .15, experiment_label, size='6')
    plt.savefig(experiment_file, format="png")
#
# A LeNet-like convnet for classifying MNIST handwritten characters 28x28
#
def convNet_LeNet(
        VERBOSE=1,
        # normalize
        NORMALIZE=True,
        # Network Parameters
        BATCH_SIZE=128,
        NUM_EPOCHS=20,
        # Number of convolutional filters
        NUM_FILTERS=32,
        # side length of maxpooling square
        NUM_POOL=2,
        # side length of convolution square
        NUM_CONV=3,
        # dropout rate for regularization
        DROPOUT_RATE=0.5,
        # hidden number of neurons first layer
        NUM_HIDDEN=128,
        # validation data
        VALIDATION_SPLIT=0.2,  # 20%
        # optimizer used
        OPTIMIZER=SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
        ):
    # Output classes, number of MNIST DIGITS
    NUM_CLASSES = 10
    # Shape of an MNIST digit image
    SHAPE_X, SHAPE_Y = 28, 28
    # Channels on MNIST
    IMG_CHANNELS = 1
    # LOAD the MNIST DATA split in training and test data
    (X_train, Y_train), (X_test, Y_test) = mnist.load_data()
    X_train = X_train.reshape(X_train.shape[0], 1, SHAPE_X, SHAPE_Y)
    X_test = X_test.reshape(X_test.shape[0], 1, SHAPE_X, SHAPE_Y)
    # convert to float32 representation for GPU computation
    X_train = X_train.astype("float32")
    X_test = X_test.astype("float32")
    if NORMALIZE:
        # NORMALIZE each pixel by dividing by max_value=255
        X_train /= 255
        X_test /= 255
    print('X_train shape:', X_train.shape)
    print(X_train.shape[0], 'train samples')
    print(X_test.shape[0], 'test samples')
    # KERAS needs each output class in OHE representation
    Y_train = np_utils.to_categorical(Y_train, NUM_CLASSES)
    Y_test = np_utils.to_categorical(Y_test, NUM_CLASSES)
    nn = Sequential()
    # FIRST BLOCK OF CONVNETS, POOLING, DROPOUT
    # apply a NUM_CONV x NUM_CONV convolution with NUM_FILTERS outputs
    # for the first layer it is also required to define the input shape
    # activation function is rectified linear
    nn.add(Convolution2D(NUM_FILTERS, NUM_CONV, NUM_CONV,
                         input_shape=(IMG_CHANNELS, SHAPE_X, SHAPE_Y)))
    nn.add(Activation('relu'))
    nn.add(Convolution2D(NUM_FILTERS, NUM_CONV, NUM_CONV))
    nn.add(Activation('relu'))
    nn.add(MaxPooling2D(pool_size=(NUM_POOL, NUM_POOL)))
    nn.add(Dropout(DROPOUT_RATE))
    # SECOND BLOCK OF CONVNETS, POOLING, DROPOUT
    # apply a NUM_CONV x NUM_CONV convolution with NUM_FILTERS outputs
    nn.add(Convolution2D(NUM_FILTERS, NUM_CONV, NUM_CONV))
    nn.add(Activation('relu'))
    nn.add(Convolution2D(NUM_FILTERS, NUM_CONV, NUM_CONV))
    nn.add(Activation('relu'))
    nn.add(MaxPooling2D(pool_size=(NUM_POOL, NUM_POOL)))
    nn.add(Dropout(DROPOUT_RATE))
    # FLATTEN the shape for dense connections
    nn.add(Flatten())
    # FIRST HIDDEN LAYER OF DENSE NETWORK
    nn.add(Dense(NUM_HIDDEN))
    nn.add(Activation('relu'))
    nn.add(Dropout(DROPOUT_RATE))
    # OUTPUT LAYER with NUM_CLASSES OUTPUTS
    # ACTIVATION IS SOFTMAX, REGULARIZATION IS L2
    nn.add(Dense(NUM_CLASSES, W_regularizer=l2(0.01)))
    nn.add(Activation('softmax'))
    # summary
    nn.summary()
    # plot the model
    plot(nn)
    # set an early-stopping callback
    early_stopping = EarlyStopping(monitor='val_loss', patience=2)
    # COMPILE THE MODEL
    # loss function is categorical_crossentropy
    # optimizer is parametric
    nn.compile(loss='categorical_crossentropy',
               optimizer=OPTIMIZER, metrics=["accuracy"])
    start = time.time()
    # FIT THE MODEL WITH VALIDATION DATA
    fitlog = nn.fit(X_train, Y_train,
                    batch_size=BATCH_SIZE, nb_epoch=NUM_EPOCHS,
                    verbose=VERBOSE, validation_split=VALIDATION_SPLIT,
                    callbacks=[early_stopping])
    elapsed = time.time() - start
    # Test the network
    results = nn.evaluate(X_test, Y_test, verbose=VERBOSE)
    print('accuracy:', results[1])
    # get the list of input parameters and their values
    frame = inspect.currentframe()
    args, _, _, values = inspect.getargvalues(frame)
    # used for printing pretty arguments
    print_Graph(fitlog, elapsed, args, values)
    return fitlog
# 2 epochs
#log = convNet_LeNet(OPTIMIZER = 'Adam', NUM_EPOCHS=2)
#print(log.history)
# 20 epochs
#log = convNet_LeNet(OPTIMIZER = 'Adam', NUM_EPOCHS=20)
#print(log.history)
# default optimizer = SGD
#log = convNet_LeNet(NUM_EPOCHS=20)
#print(log.history)
# default optimizer = RMSProp
#log = convNet_LeNet(OPTIMIZER=RMSprop(), NUM_EPOCHS=20)
#print(log.history)
## default optimizer
#log = convNet_LeNet(OPTIMIZER='Adam', DROPOUT_RATE=0)
#print(log.history)
# default optimizer
#log = convNet_LeNet(OPTIMIZER='Adam', DROPOUT_RATE=0.1)
#print(log.history)
# default optimizer
#log = convNet_LeNet(OPTIMIZER='Adam', DROPOUT_RATE=0.2)
#print(log.history)
# default optimizer
#log = convNet_LeNet(OPTIMIZER='Adam', DROPOUT_RATE=0.4)
#print(log.history)
# default optimizer
#log = convNet_LeNet(OPTIMIZER='Adam', BATCH_SIZE=64)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', BATCH_SIZE=128)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', BATCH_SIZE=256)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', BATCH_SIZE=512)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', BATCH_SIZE=1024)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', BATCH_SIZE=2048)
#print(log.history)
#
#log = convNet_LeNet(OPTIMIZER='Adam', BATCH_SIZE=4096)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', VALIDATION_SPLIT=0.8)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', VALIDATION_SPLIT=0.6)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', VALIDATION_SPLIT=0.4)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', VALIDATION_SPLIT=0.2)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', VALIDATION_SPLIT=0.2, NORMALIZE=False)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', VALIDATION_SPLIT=0.2, NUM_FILTERS=64)
#print(log.history)
log = convNet_LeNet(OPTIMIZER='Adam', NUM_FILTERS=128)
print(log.history)
# log = convNet_LeNet(OPTIMIZER='Adam', NUM_FILTERS=256)
# print(log.history)
# x log = convNet_LeNet(OPTIMIZER='Adam', NUM_POOL=4)
# x print(log.history)
# log = convNet_LeNet(OPTIMIZER='Adam', NUM_POOL=8)
# print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', NUM_CONV=4)
#print(log.history)
# x log = convNet_LeNet(OPTIMIZER='Adam', NUM_CONV=8)
# x print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', NUM_HIDDEN=32)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', NUM_HIDDEN=64)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', NUM_HIDDEN=256)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', NUM_HIDDEN=512)
#print(log.history)
#log = convNet_LeNet(OPTIMIZER='Adam', NUM_HIDDEN=1024)
#print(log.history)
# VERBOSE=1,
# # normlize
# NORMALIZE = True,
# # Network Parameters
# BATCH_SIZE = 128,
# NUM_EPOCHS = 100,
# # Number of convolutional filters
# NUM_FILTERS = 32,
# # side length of maxpooling square
# NUM_POOL = 2,
# # side length of convolution square
# NUM_CONV = 3,
# # dropout rate for regularization
# DROPOUT_RATE = 0.5,
# # hidden number of neurons first layer
# N_HIDDEN = 128,
# # validation data
# VALIDATION_SPLIT=0.2, # 20%
# # optimizer used
# OPTIMIZER = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
#plt.show()
</pre>
<br />
The next posting describes the code. Then you will see dozens of experiments for exploring the hyper-parameter space and inferring some rules of thumb for fine-tuning our deep learning nets.<br />
<br />
<br />
Stay tuned: over the next months we will see more than 20 deep learning nets in different contexts, showing superhuman capacity.

3rd and 7th on Amazon (2015-12-08)

<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbAgvn6Ya83aBBa6_bJpyVQMf1c0nvQ00Nx5s34cv7exWFtkT4CNjCVp1jmuE1XprZkAzcOU324rF8rqNj5i8jHlsGLTqQmGLP8_xhdDTGW_n98yHiP33w4V75Ro8-1gHG-mq339Uc483T/s1600/spark+python.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="306" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbAgvn6Ya83aBBa6_bJpyVQMf1c0nvQ00Nx5s34cv7exWFtkT4CNjCVp1jmuE1XprZkAzcOU324rF8rqNj5i8jHlsGLTqQmGLP8_xhdDTGW_n98yHiP33w4V75Ro8-1gHG-mq339Uc483T/s640/spark+python.png" width="640" /></a></div>
<br />

Special Edition Programming Interview Questions Solved in C++ (2015-12-03)

<h1 class="a-size-large a-spacing-none" id="title" style="background-color: white; box-sizing: border-box; color: #111111; font-family: Arial, sans-serif; font-size: 21px !important; line-height: 1.3 !important; margin-bottom: 0px !important; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding: 0px; text-rendering: optimizeLegibility;">
<span class="a-size-large" id="productTitle" style="box-sizing: border-box; line-height: 1.3 !important; text-rendering: optimizeLegibility;"><a href="http://www.amazon.com/Special-Programming-Interview-Questions-Solved/dp/1519327544">Tree, Graph, Bit, Dynamic Programming, and Design Patterns (Special Collections on Programming) (Volume 1)</a></span> </h1>
<div>
<br /></div>
<div>
A collection of more than 150 interview questions in C++. Useful for nailing your next job interview</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://ecx.images-amazon.com/images/I/71uDAK5N5cL.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://ecx.images-amazon.com/images/I/71uDAK5N5cL.jpg" height="640" width="426" /></a></div>
<div>
<br /></div>

A collection of Advanced Data Science and Machine Learning Interview Questions Solved in Python and Spark (II) (Volume 7) (2015-12-02)

<h1 class="a-size-large a-spacing-none" id="title" style="background-color: white; box-sizing: border-box; color: #111111; font-family: Arial, sans-serif; font-size: 21px !important; line-height: 1.3 !important; margin-bottom: 0px !important; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding: 0px; text-rendering: optimizeLegibility;">
<span class="a-size-large" id="productTitle" style="box-sizing: border-box; line-height: 1.3 !important; text-rendering: optimizeLegibility;"><a href="http://www.amazon.com/collection-Advanced-Learning-Interview-Questions/dp/1518678645/">Hands-on Big Data and Machine ... Programming Interview Questions) (Volume 7)</a></span></h1>
<div>
<br /></div>
<div>
Advanced machine learning, with hands-on examples in Spark and Python</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://ecx.images-amazon.com/images/I/5131LwDI3IL.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://ecx.images-amazon.com/images/I/5131LwDI3IL.jpg" height="640" width="426" /></a></div>
<div>
<br /></div>

A collection of Data Science Interview Questions Solved in Python and Spark (Volume 6) (2015-12-01)

<h1 class="a-size-large a-spacing-none" id="title" style="background-color: white; box-sizing: border-box; color: #111111; font-family: Arial, sans-serif; font-size: 21px !important; line-height: 1.3 !important; margin-bottom: 0px !important; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding: 0px; text-rendering: optimizeLegibility;">
<span class="a-size-large" id="productTitle" style="box-sizing: border-box; line-height: 1.3 !important; text-rendering: optimizeLegibility;"><a href="http://www.amazon.com/collection-Science-Interview-Questions-Solved/dp/1517216710">Hands-on Big Data and Machine Learning (Volume 6)</a></span></h1>
<div>
<span class="a-size-large" style="box-sizing: border-box; line-height: 1.3 !important; text-rendering: optimizeLegibility;"><br /></span></div>
<div>
<span style="line-height: 20.8px;">A new book on machine learning and data mining, with practical hands-on examples in Spark and Python</span></div>
<div>
<span style="line-height: 20.8px;"><br /></span></div>
<div>
<span style="line-height: 20.8px;"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://ecx.images-amazon.com/images/I/510sI2SONpL.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://ecx.images-amazon.com/images/I/510sI2SONpL.jpg" height="640" width="425" /></a></div>
<div>
<span style="line-height: 20.8px;"><br /></span></div>
<div>
<span style="line-height: 20.8px;"><br /></span></div>

Elsevier: Machine Learning Content Discoverability (2015-11-22)

<a href="http://www.slideshare.net/antoniogulli/2015-machine-learning-elsevier-demos">http://www.slideshare.net/antoniogulli/2015-machine-learning-elsevier-demos</a><br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7kqNNEnNj8YMbligI4T8egTOuMnHlU02PGkp4EcKlDubuno_YOfaVUn3Ad8WaL7A4JyHe3HMCJ1aDBkXhZUBxkeNUWv6SFxiUB76_lVExU3-3DWJVCZXYmdLLbCDG0zRIs5Io3BGK3Kyx/s1600/2015-machine-learning-elsevier-demos-1-638%255B1%255D.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7kqNNEnNj8YMbligI4T8egTOuMnHlU02PGkp4EcKlDubuno_YOfaVUn3Ad8WaL7A4JyHe3HMCJ1aDBkXhZUBxkeNUWv6SFxiUB76_lVExU3-3DWJVCZXYmdLLbCDG0zRIs5Io3BGK3Kyx/s640/2015-machine-learning-elsevier-demos-1-638%255B1%255D.jpg" width="640" /></a></div>
<br />
<br />

Something cool made with our API (2015-11-17)

Very cool<br />
<br />
<a href="http://blog.sciencedirect.com/posts/reach-for-the-stars-how-one-developer-uses-sciencedirect-apis-to-achieve-more-for-nasa">blog.sciencedirect.com/posts/reach-for-the-stars-how-one-developer-uses-sciencedirect-apis-to-achieve-more-for-nasa</a><br />
<br />
<br />
For 20 years, the Smithsonian/NASA Astrophysics Data System (ADS) has kept all professional astronomers worldwide up-to-date via their digital library of 12 million records which provides links to ScienceDirect and other platforms for full-text retrieval. The ADS maintains relationships with all major publishers and offers users access to four million full-text article links with some of those links originating in 40 full-text Elsevier journals on ScienceDirect.<br />
To increase the visibility of, and encourage linking to, their subscribed full text (especially articles written by NASA researchers), NASA had the idea of adding thumbnails of graphics appearing within an article to the abstract view of its publication. To do this, they turned to the ScienceDirect Object Retrieval and Object Search APIs to mine the images, and then linked them to the corresponding articles on ScienceDirect. So far, the ADS has implemented this feature for 32,000 publications.<br />
<h4>
A view of the ADS abstract page<br /><a href="https://ui.adsabs.harvard.edu/"><div class="media-thumbnail-frame">
<img alt="ADS abstract page" class="media-image" height="223" src="http://blog.sciencedirect.com/sites/g/files/g1381516/f/styles/large/public/201511/fig1.png?itok=cz1N6i0m" style="height: 223px; width: 480px;" typeof="foaf:Image" width="480" /></div>
</a></h4>
<h4>
A view of the ADS graphics page with thumbnails linking to the full-text of the article<br /><a href="https://ui.adsabs.harvard.edu/"><div class="media-thumbnail-frame">
<img alt="ADS thumbnails page" class="media-image" height="257" src="http://blog.sciencedirect.com/sites/g/files/g1381516/f/styles/large/public/201511/fig2.png?itok=u2EoV51V" style="height: 257px; width: 480px;" typeof="foaf:Image" width="480" /></div>
</a></h4>
<br />
<blockquote>
<strong>“My experience with the ScienceDirect API was exemplary. A well-designed API with a very efficient and friendly support team to back it up!”</strong><br />
<em>- Edwin Henneken, IT Specialist for the Smithsonian/NASA Astrophysics Data System, employed at the Smithsonian Astrophysical Observatory in Cambridge, Massachusetts.</em></blockquote>
<br />The <a href="https://ui.adsabs.harvard.edu/">redesigned ADS</a> remains in beta release and can be easily accessed while more infornation about the <a href="https://en.wikipedia.org/wiki/Astrophysics_Data_System">ADS in general</a>, is also available.<br />
<h4>
Example of ScienceDirect article page with images<br /><a href="http://www.sciencedirect.com/"><div class="media-thumbnail-frame">
<img alt="ScienceDirect homepage" class="media-image" height="332" src="http://blog.sciencedirect.com/sites/g/files/g1381516/f/styles/large/public/201511/fig3.png?itok=oG3T8Mta" style="height: 332px; width: 480px;" typeof="foaf:Image" width="480" /></div>
</a></h4>
ScienceDirect APIs are designed to help developers retrieve and integrate full-text content from publications on ScienceDirect into their websites or applications. Visit the <a href="https://www.elsevier.com/solutions/sciencedirect/support/api">ScienceDirect API page</a> to learn more, watch videos and get started.<br />
<a href="https://www.youtube.com/watch?v=I3cjbB38Z4A&feature=youtu.be"></a><br />
<div class="media-thumbnail-frame">
<a href="https://www.youtube.com/watch?v=I3cjbB38Z4A&feature=youtu.be"><img alt="API text mining videos" class="media-image" height="120" src="http://blog.sciencedirect.com/sites/g/files/g1381516/f/styles/large/public/201511/text_mining_0.png?itok=nZmqrMqu" style="height: 120px; width: 480px;" typeof="foaf:Image" width="480" /></a></div>

Boson Higgs (2015-11-13)

According to Wikipedia, "<b>On 4 July 2012, </b>the discovery of a new particle with a mass between 125 and 127 GeV was announced; physicists suspected that it was the Higgs boson".
<br />
<br />
However, Google Scholar returns results <b>from 1960 and 1990, which is 22 years </b>before the scientific discovery. One result is from Elsevier.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOv5ejikXxt-RKvhSL_Zro3Yy9g1Y61gpqUmFMFDHNGUf-Uitn1XcJI6nsL2RQdGOceVd07TGfF_WFxvVrIV8aGJlr68U3M6pMKN58XtNqme7aLMz4_Ch0ZFDSpEPaHIsMS9b7chJ85t0p/s1600/boson+scholar.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="410" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOv5ejikXxt-RKvhSL_Zro3Yy9g1Y61gpqUmFMFDHNGUf-Uitn1XcJI6nsL2RQdGOceVd07TGfF_WFxvVrIV8aGJlr68U3M6pMKN58XtNqme7aLMz4_Ch0ZFDSpEPaHIsMS9b7chJ85t0p/s640/boson+scholar.png" width="640" /></a></div>
<br />
ScienceDirect returns fresher and more relevant results.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhs6jfmEk1oa8qAyRD5jbYb7aDpKilrAfSskOwShrs0kvqpAiFRTAtCJ-fHu7_SYeJEQdzzL_2wwHZ9BOPzYGAJVGwFFRyJydC22zR4Iiyu0foCY6DcOxBA41Jgo5ntkky3artNny-XPqFO/s1600/boson+sc.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="530" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhs6jfmEk1oa8qAyRD5jbYb7aDpKilrAfSskOwShrs0kvqpAiFRTAtCJ-fHu7_SYeJEQdzzL_2wwHZ9BOPzYGAJVGwFFRyJydC22zR4Iiyu0foCY6DcOxBA41Jgo5ntkky3artNny-XPqFO/s640/boson+sc.jpg" width="640" /></a></div>
<br />

Instantaneous Recommendation: real-time suggestions for your Academic Library (2015-11-12)

One of my favorite features shipped during the last round is a form of instantaneous recommendation. This feature suggests relevant new papers in real time, as soon as my library is updated.<br />
<br />
So suppose that I add a few papers about deep learning to my library and that this is the first time I have papers about this research topic in my library.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGGsrVbtKNt91EEvsnr-_4cKKCPWvomXC1W4pCQBMr95Sn73EiiPG-vIZGr__vxmBvcAS1nJznZjvp351d0j-bEkvNCbAkM5jmM8eP1UIJbfLpknEh4gUZ8MAmg7vL2r5GCX4S4dYlwHNC/s1600/addsuggest.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGGsrVbtKNt91EEvsnr-_4cKKCPWvomXC1W4pCQBMr95Sn73EiiPG-vIZGr__vxmBvcAS1nJznZjvp351d0j-bEkvNCbAkM5jmM8eP1UIJbfLpknEh4gUZ8MAmg7vL2r5GCX4S4dYlwHNC/s640/addsuggest.jpg" width="640" /></a></div>
<br />
The suggestions are immediately updated, and I see papers about deep neural networks for speech recognition, convolutional networks, and LVCSR,<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgecgppa_5K9ceQ1C_lY8jT2P94Kd7_oGFNntODwwIQrVMvTGckokxSxFw5VzOo_4WImhwv0exI6qkHV4-zuKubCO0hWofh1ubYCKN_px7jiic3CQofMjqgi9TSgMvCZq_0SkwIMUHhhJw5/s1600/suggest1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="262" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgecgppa_5K9ceQ1C_lY8jT2P94Kd7_oGFNntODwwIQrVMvTGckokxSxFw5VzOo_4WImhwv0exI6qkHV4-zuKubCO0hWofh1ubYCKN_px7jiic3CQofMjqgi9TSgMvCZq_0SkwIMUHhhJw5/s640/suggest1.jpg" width="640" /></a></div>
<br />
and relevant papers published by Yann LeCun<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1WHM2_4FD8VALDLmS1Q0-UmJDZejRMuxTyzrdTssrbpqCBhFHJtb2H2Nys5-JLLjtsiOkPSblcmVdSlPmTNcI-rE8RfcPuPq0FFFdZVVR1YO3Uxds4C8XZfa8lLCJO2TGA4_3Vbr_31tL/s1600/suggest2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="246" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1WHM2_4FD8VALDLmS1Q0-UmJDZejRMuxTyzrdTssrbpqCBhFHJtb2H2Nys5-JLLjtsiOkPSblcmVdSlPmTNcI-rE8RfcPuPq0FFFdZVVR1YO3Uxds4C8XZfa8lLCJO2TGA4_3Vbr_31tL/s640/suggest2.jpg" width="640" /></a></div>
<br />
I believe this feature is useful for exploring a subject when you are not familiar with the topic, and for making sure that your next paper has a solid "Related Works" section where the most important papers for your research activity are mentioned.<br />
<div class="separator" style="clear: both; text-align: center;">
</div>

Stats is big data (2015-11-11)

<div style="background-color: white; text-align: justify;">
<b style="color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px;">(reposted from </b><span style="background-color: transparent; font-size: 14.4px; line-height: 20.16px;"><span style="color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif;"><b><a href="http://blog.mendeley.com/academic-features/new-research-features-on-mendeley-com/">http://blog.mendeley.com/academic-features/new-research-features-on-mendeley-com/</a> )</b></span></span></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
<strong><br /></strong></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
<strong>Feature: Stats</strong><br />If you are a published author, Mendeley’s “Stats” feature provides you with a unique, aggregated view of how your published articles are performing in terms of citations, Mendeley sharing, and (depending on who your article was published with) downloads/views. You can also drill down into each of your published articles to see the statistics on each item you have published. This powerful tool allows you to see how your work is being used by the scientific community, using data from a number of sources including Mendeley, Scopus, NewsFlo, and ScienceDirect.</div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
<a href="http://static.blog.mendeley.com/wp-content/uploads/2015/11/HybridProfile.png" style="color: #b85b5a; text-decoration: none;"><img alt="HybridProfile" class="wp-image-47690 aligncenter" height="1580" src="http://static.blog.mendeley.com/wp-content/uploads/2015/11/HybridProfile.png" style="border: none; max-width: 100%; padding: 0px;" width="560" /></a></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
Stats gives you an aggregated view on the performance of your publications, including metrics such as citations, Mendeley readership and group activity, academic discipline and status of your readers, as well as any mentions in the news media – helping you to understand and evaluate the impact of your published work. With our integration with ScienceDirect, you can find information on views (PDF and HTML downloads), search terms used to get to your article, geographic distribution of your readership, and links to various source data providers.</div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
Please keep in mind that Stats are only available for some published authors whose works are listed in the Scopus citation database. To find out if your articles are included, just visit <a href="https://www.mendeley.com/stats/?utm_source=blog.mendeley.com/academic-features/new-research-features-on-mendeley-com&utm_medium=referrall&utm_campaign=blogstats" style="color: #b85b5a; text-decoration: none;">www.mendeley.com/stats</a> and begin the process of claiming your Scopus author profile. If not, please be patient as we work further on this feature.</div>
<div>
<br /></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-17511504281206837252015-11-10T03:16:00.000-08:002015-11-08T03:53:34.543-08:00Satisfying the exploratory search needs: poster query {dyscalculia}<b>{dyscalculia} </b>is a severe difficulty in making arithmetical calculations, resulting from a brain disorder. It is the scientific term for a cognitive problem affecting 3%-6% of the world population. Therefore, many people are interested in better understanding the topic.
<br />
<br />
<div class="separator" style="clear: both; text-align: justify;">
Google Scholar returns Elsevier content from <u>1992 and 1985, and Wiley content from 1996.</u></div>
<div class="separator" style="clear: both; text-align: justify;">
<u><br /></u></div>
Undoubtedly, science has made significant progress in the last 9 years, but this progress is not easily found in Google Scholar for this query.
<br />
<div class="separator" style="clear: both; text-align: justify;">
<span style="background-color: white; color: #222222; font-family: "arial" , , sans-serif; font-size: x-small; line-height: 15.6px; text-align: left;"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS1VofgcJaNpOF8aWjG6U3qPaxHdfIv1WKQh7SNlQK_mOYAP7Vf9WMVOihHHRpo_pyc9SbLfksiqLocg4cx23IZ4dfakfW3IIn0rz2dT2ihjQiVqtRSkUZQ11rlja0xuCLNo_H9-8laCqF/s1600/dyscalculia_scholar.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="436" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS1VofgcJaNpOF8aWjG6U3qPaxHdfIv1WKQh7SNlQK_mOYAP7Vf9WMVOihHHRpo_pyc9SbLfksiqLocg4cx23IZ4dfakfW3IIn0rz2dT2ihjQiVqtRSkUZQ11rlja0xuCLNo_H9-8laCqF/s640/dyscalculia_scholar.jpg" width="640" /></a></div>
<br />
ScienceDirect finds fresh Elsevier content for <b style="color: #222222; font-family: arial, sans-serif-light, sans-serif; font-size: small; line-height: 15.6px;">{dyscalculia}</b>, including books and articles. All the results are from 2015 and 2016 (pre-prints).
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfvDZ2krv_4UiXREkOg-4uZEJ9FXu737QpfZs7djvPIuV2F9rdJ1L0bKJ-9xrKryWxths5RMJBV64ZNK2dXqoZ6xFNRGaHwuc-Qv4nmBPEZ40aQk4gNulkXEhYpz-9zXRaPRtmaKOVODLo/s1600/dyscalculia_sd.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="420" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfvDZ2krv_4UiXREkOg-4uZEJ9FXu737QpfZs7djvPIuV2F9rdJ1L0bKJ-9xrKryWxths5RMJBV64ZNK2dXqoZ6xFNRGaHwuc-Qv4nmBPEZ40aQk4gNulkXEhYpz-9zXRaPRtmaKOVODLo/s640/dyscalculia_sd.jpg" width="640" /></a></div>
<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-11997033317943283382015-11-09T03:10:00.000-08:002015-11-08T03:46:05.270-08:00New research features on Mendeley.com - Recommends (reposted from <a href="http://blog.mendeley.com/academic-features/new-research-features-on-mendeley-com/">http://blog.mendeley.com/academic-features/new-research-features-on-mendeley-com/</a>)<br />
<br />
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
Mendeley’s Data Science team have been working to crack one of the hardest “big data” problems of all: How to recommend interesting articles that users might want to read? For the past six months they have been working to integrate 6 large data sets from 3 different platforms to create the basis for a recommender system. These data sets often contain tens of millions of records each, and represent different dimensions which can all be applied to the problem of understanding what a user is looking for, and providing them with a high-quality set of recommendations.</div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
<br /></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
With the (quite literally) massive base data set in place, the team then tested over 50 different recommender algorithms against a “gold standard” (which was itself revised five times for the best possible accuracy). Over 500 experiments have been done to tweak our algorithms so they can deliver the best possible recommendations. The basic principle is to combine our vast knowledge of what users are storing in their Mendeley libraries with the richness of the citation graph (courtesy of Scopus) and with a predictive model that can be validated against what users actually did. The end result is a tailored set of recommendations for each user who has a minimum threshold of documents in their library.</div>
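The validation loop described above — scoring each candidate algorithm against a held-out "gold standard" of what users actually did — can be sketched in a few lines. This is a toy illustration of the evaluation idea only; the names (`evaluate`, `precision_at_k`) and data are hypothetical, not Mendeley's actual pipeline:

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended documents the user actually added."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    return sum(1 for doc in top_k if doc in relevant) / len(top_k)

def evaluate(algorithm, gold_standard, k=10):
    """Average precision@k of one candidate algorithm over all test users.

    gold_standard maps each user to the set of documents they really added
    (the held-out 'what users actually did' data)."""
    scores = [precision_at_k(algorithm(user), relevant, k)
              for user, relevant in gold_standard.items()]
    return sum(scores) / len(scores)

# Hypothetical example: compare two stand-in recommenders.
gold = {"u1": {"d1", "d2"}, "u2": {"d3"}}
algo_a = lambda user: ["d1", "d9", "d2"]
algo_b = lambda user: ["d8", "d7"]
print(evaluate(algo_a, gold), evaluate(algo_b, gold))
```

Repeating this comparison over many algorithm variants is, in essence, what the 500 experiments above amount to: whichever variant scores best against the gold standard wins.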
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://static.blog.mendeley.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-03-at-15.08.26.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://static.blog.mendeley.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-03-at-15.08.26.png" height="608" width="640" /></a></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
We are happy to report that two successive rounds of qualitative user testing have indicated that 80% of our test users rated the quality of their tailored recommendations as “Very good” (43%) or “Good” (37%), which gives us confidence that the vast majority of Mendeley reference management users will receive high-quality recommendations that will save them time in discovering important papers they should be reading.</div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
<br /></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
For those who are new to Mendeley, we have made it easy for you to get started and import your documents – simply drag-and-drop your papers, and get high-quality recommendations.</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://static.blog.mendeley.com/wp-content/uploads/2015/11/fine-tune-suggest.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://static.blog.mendeley.com/wp-content/uploads/2015/11/fine-tune-suggest.png" height="196" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
On our new “Suggest” page you’ll be getting improved article suggestions, driven by four different recommendation algorithms to support different scientific needs:</div>
<ul style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; list-style: none; margin-left: 0px; padding: 0px 0px 0px 10px; text-align: justify; text-indent: -10px;">
<li style="list-style-position: inside; list-style-type: disc; margin: 7px 0px 8px 10px;"><em>Popular in your discipline</em> – Shows you the seminal works, for all time, in your field</li>
<li style="list-style-position: inside; list-style-type: disc; margin: 7px 0px 8px 10px;"><em>Trending in your discipline</em> – Shows you what articles are popular right now in your discipline</li>
<li style="list-style-position: inside; list-style-type: disc; margin: 7px 0px 8px 10px;"><em>Based on the last document in your library</em> – Gives you articles similar to the one you just added</li>
<li style="list-style-position: inside; list-style-type: disc; margin: 7px 0px 8px 10px;"><em>Based on all the documents in your library</em> – Provides the most tailored set of recommended articles by comparing the contents of your library with the contents of all other users on Mendeley.</li>
</ul>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
Suggestions you receive will be frequently recalculated and tailored to you based on the contents of your library, making sure that there is always something new for you to discover. This is no insignificant task, as we calculate over 25 million new recommendations with each iteration. This means that even if you don’t add new documents to your library, you will still get new recommendations based on the activity of other Mendeley users with libraries similar to yours.</div>
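The "based on all the documents in your library" idea above can be sketched as simple user-to-user collaborative filtering: find users whose libraries overlap with yours, and surface the documents they have that you lack. This is my own minimal illustration under that assumption, not Mendeley's actual algorithm:

```python
def jaccard(a, b):
    """Overlap between two libraries, each a set of document ids."""
    return len(a & b) / len(a | b) if a | b else 0.0

def suggest(user, libraries, n=5):
    """Recommend documents that similar users own and `user` lacks,
    scored by the similarity of the libraries they come from."""
    own = libraries[user]
    scores = {}
    for other, lib in libraries.items():
        if other == user:
            continue
        sim = jaccard(own, lib)
        for doc in lib - own:                       # only unseen documents
            scores[doc] = scores.get(doc, 0.0) + sim
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [doc for doc, _ in ranked[:n]]

libraries = {
    "alice": {"p1", "p2", "p3"},
    "bob":   {"p1", "p2", "p4"},   # similar to alice, so p4 ranks high
    "carol": {"p9"},               # no overlap, so p9 scores zero
}
print(suggest("alice", libraries))
```

At Mendeley's scale the same idea needs approximate nearest-neighbour search and the citation graph as an extra signal, but the ranking principle is the one shown here.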
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
<br /></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
To find your recommended articles, check out <a href="https://www.mendeley.com/suggest/?utm_source=blog.mendeley.com/academic-features/new-research-features-on-mendeley-com&utm_medium=referrall&utm_campaign=blogsuggest" style="color: #b85b5a; text-decoration: none;">www.mendeley.com/suggest</a> and begin discovering new papers in your field!</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div style="background-color: white; color: #333333; font-family: Arial, Helvetica, Verdana, sans-serif; font-size: 14.4px; line-height: 20.16px; text-align: justify;">
<br /></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-62836450815045180902015-11-08T22:51:00.000-08:002015-11-08T03:46:28.698-08:00Academic Search and Relevance: basic normalization for matchingOne more post about Academic Search and Relevance. This time we go back to basics: there is little you can do for relevance if you do not match the article first. To do so, you need to assume that users will make mistakes as they type, and be proactive in correcting those mistakes on their behalf. Let's see a few examples.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjlhhQ6iBwSMuA3NDzGerRK96v0aBJ7NTTbDEOyun9sAr1btbIvbKDDRW0-RCdxXY4ES19UePYWc4aYRtvwW62Yc1nx0XKP1LaK0RxMbpzfX_7gIOuRLpaZJIAPWK8U1pMwElmmeMPsnWT/s1600/anal1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="127" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjlhhQ6iBwSMuA3NDzGerRK96v0aBJ7NTTbDEOyun9sAr1btbIvbKDDRW0-RCdxXY4ES19UePYWc4aYRtvwW62Yc1nx0XKP1LaK0RxMbpzfX_7gIOuRLpaZJIAPWK8U1pMwElmmeMPsnWT/s400/anal1.jpg" width="400" /></a></div>
<br />
<div style="text-align: justify;">
Here the mistake is made on purpose to simulate a user with a different keyboard. The search engine should automatically apply this kind of normalization, but it does not; ScienceDirect does.</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOz4xVH1r-R0Ft6Syl5fHoLnEAy86zjtaO0xNmFYE_pILotmnRCo5VmB1_1zdElbF1SE2CYgEZ8Nn79yd5loIjnsipIDLFbwBoOVcNSaHcwQrQ3E0t2DLLLS5mBbgX68GtqhJujFoeM1UA/s1600/anal2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="177" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOz4xVH1r-R0Ft6Syl5fHoLnEAy86zjtaO0xNmFYE_pILotmnRCo5VmB1_1zdElbF1SE2CYgEZ8Nn79yd5loIjnsipIDLFbwBoOVcNSaHcwQrQ3E0t2DLLLS5mBbgX68GtqhJujFoeM1UA/s640/anal2.jpg" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Here the idea is to search for a specific item related to prostate cancer named {ARN-509}. By mistake it is written as {ARN -509}, with a space before the hyphen, and no match is given.</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjv9zX9vxLws-ewm-kkpWwoFgvy28bKKnP4vL-I723-5SbMxKtl77Vnz4eXoLKZ2xxSiyrhUNP_zyx2xnllkEjQQqeRrRRda14XPCTVYb-T0ry5nMmDtdRIjO7ksS1eN1SxcwSQGHE4DUck/s1600/arn1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="353" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjv9zX9vxLws-ewm-kkpWwoFgvy28bKKnP4vL-I723-5SbMxKtl77Vnz4eXoLKZ2xxSiyrhUNP_zyx2xnllkEjQQqeRrRRda14XPCTVYb-T0ry5nMmDtdRIjO7ksS1eN1SxcwSQGHE4DUck/s640/arn1.jpg" width="640" /></a></div>
<br />
<br />
ScienceDirect simply matches it regardless of the mistake.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwSpAhQZOzRryhOiVYxjxdmmWcINVakcXyrrUoeblCxPA4U0As4CawC1DWBPfsdkR8RAEEsfJDb4Zn7-LsTlOKN7_WZxfR8FAAE00Vr5wAbcy1knAYyhHMkPeq7WMLGOBvlGjzmt8EbTB2/s1600/arn2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwSpAhQZOzRryhOiVYxjxdmmWcINVakcXyrrUoeblCxPA4U0As4CawC1DWBPfsdkR8RAEEsfJDb4Zn7-LsTlOKN7_WZxfR8FAAE00Vr5wAbcy1knAYyhHMkPeq7WMLGOBvlGjzmt8EbTB2/s640/arn2.png" width="640" /></a></div>
<br />
Google, on the other hand, matches only the exact term; in this case it is not able to correct the user's mistake automatically.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjATXHbgKjoiGiAlpBKSXSX7YMUN5CseeTkzzllRCnIdkPQicptZA0IoSXGbK2EjD54rK_nWVpJd16JQ9_KbFFG8PQDmPtgvGSHW-VrOxJP3WMWOVdqMLb0OLFHBBsHVR5U71jdlW-8km2d/s1600/arn3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="244" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjATXHbgKjoiGiAlpBKSXSX7YMUN5CseeTkzzllRCnIdkPQicptZA0IoSXGbK2EjD54rK_nWVpJd16JQ9_KbFFG8PQDmPtgvGSHW-VrOxJP3WMWOVdqMLb0OLFHBBsHVR5U71jdlW-8km2d/s640/arn3.png" width="640" /></a></div>
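Both failure modes above — keyboard/accent variants and stray spaces around a hyphen — can be handled by a basic normalization pass applied to the query (and to the index) before matching. The sketch below is my own illustration of the idea, not ScienceDirect's actual implementation; the hyphen rule in particular is deliberately aggressive and a production system would tune it:

```python
import re
import unicodedata

def normalize_query(q):
    """Fold a raw user query into a canonical matching form:
    strip accents and other Unicode variants, lowercase,
    remove spaces around hyphens, and collapse whitespace."""
    q = unicodedata.normalize("NFKD", q)                  # decompose accented chars
    q = "".join(c for c in q if not unicodedata.combining(c))
    q = q.lower()
    q = re.sub(r"\s*-\s*", "-", q)                        # 'ARN -509' -> 'arn-509'
    q = re.sub(r"\s+", " ", q).strip()                    # collapse whitespace
    return q

print(normalize_query("ARN -509"))   # matches a document indexed as 'arn-509'
print(normalize_query("análysis"))   # accent folded away
```

If the same function is applied at indexing time, {ARN-509} and {ARN -509} land on the same term, and the user's mistake becomes invisible to the matcher.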
<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-84878533618872152462015-11-07T22:49:00.001-08:002015-11-07T22:49:16.814-08:00TOC for my new book: A collection of Advanced Data Science and Machine Learning Interview Questions Solved in Python and Spark Table of Contents<br />
1.<span class="Apple-tab-span" style="white-space: pre;"> </span>Why is Cross Validation important?<span class="Apple-tab-span" style="white-space: pre;"> </span>11<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>11<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>11<br />
2.<span class="Apple-tab-span" style="white-space: pre;"> </span>Why is Grid Search important?<span class="Apple-tab-span" style="white-space: pre;"> </span>12<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>12<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>12<br />
3.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are the new Spark DataFrame and the Spark Pipeline? And how we can use the new ML library for Grid Search<span class="Apple-tab-span" style="white-space: pre;"> </span>13<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>13<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>14<br />
4.<span class="Apple-tab-span" style="white-space: pre;"> </span>How to deal with categorical features? And what is one-hot-encoding?<span class="Apple-tab-span" style="white-space: pre;"> </span>16<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>16<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>17<br />
5.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are generalized linear models and what is an R Formula?<span class="Apple-tab-span" style="white-space: pre;"> </span>18<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>18<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>18<br />
6.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are the Decision Trees?<span class="Apple-tab-span" style="white-space: pre;"> </span>19<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>19<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>21<br />
7.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are the Ensembles?<span class="Apple-tab-span" style="white-space: pre;"> </span>22<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>22<br />
8.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a Gradient Boosted Tree?<span class="Apple-tab-span" style="white-space: pre;"> </span>22<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>22<br />
9.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a Gradient Boosted Trees Regressor?<span class="Apple-tab-span" style="white-space: pre;"> </span>23<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>23<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>23<br />
10.<span class="Apple-tab-span" style="white-space: pre;"> </span>Gradient Boosted Trees Classification<span class="Apple-tab-span" style="white-space: pre;"> </span>24<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>24<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>25<br />
11.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a Random Forest?<span class="Apple-tab-span" style="white-space: pre;"> </span>26<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>26<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>26<br />
12.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is an AdaBoost classification algorithm?<span class="Apple-tab-span" style="white-space: pre;"> </span>27<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>27<br />
13.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a recommender system?<span class="Apple-tab-span" style="white-space: pre;"> </span>28<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>28<br />
14.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a collaborative filtering ALS algorithm?<span class="Apple-tab-span" style="white-space: pre;"> </span>29<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>29<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>30<br />
15.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is the DBSCAN clustering algorithm?<span class="Apple-tab-span" style="white-space: pre;"> </span>32<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>32<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>32<br />
16.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a Streaming K-Means?<span class="Apple-tab-span" style="white-space: pre;"> </span>33<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>33<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>34<br />
17.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Canopy Clustering?<span class="Apple-tab-span" style="white-space: pre;"> </span>34<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>34<br />
18.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Bisecting K-Means?<span class="Apple-tab-span" style="white-space: pre;"> </span>35<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>35<br />
19.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is the PCA Dimensional reduction technique?<span class="Apple-tab-span" style="white-space: pre;"> </span>36<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>36<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>37<br />
20.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is the SVD Dimensional reduction technique?<span class="Apple-tab-span" style="white-space: pre;"> </span>38<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>38<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>38<br />
21.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Latent Semantic Analysis (LSA)?<span class="Apple-tab-span" style="white-space: pre;"> </span>39<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>39<br />
22.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Parquet?<span class="Apple-tab-span" style="white-space: pre;"> </span>39<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>39<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>39<br />
23.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is the Isotonic Regression?<span class="Apple-tab-span" style="white-space: pre;"> </span>40<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>40<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>40<br />
24.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is LARS?<span class="Apple-tab-span" style="white-space: pre;"> </span>41<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>41<br />
25.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is GLMNET?<span class="Apple-tab-span" style="white-space: pre;"> </span>42<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>42<br />
26.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is SVM with soft margins?<span class="Apple-tab-span" style="white-space: pre;"> </span>43<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>43<br />
27.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is the Expectation Maximization Clustering algorithm?<span class="Apple-tab-span" style="white-space: pre;"> </span>44<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>44<br />
28.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a Gaussian Mixture?<span class="Apple-tab-span" style="white-space: pre;"> </span>45<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>45<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>45<br />
29.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is the Latent Dirichlet Allocation topic model?<span class="Apple-tab-span" style="white-space: pre;"> </span>46<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>46<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>47<br />
30.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is the Associative Rule Learning?<span class="Apple-tab-span" style="white-space: pre;"> </span>48<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>48<br />
31.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is FP-growth?<span class="Apple-tab-span" style="white-space: pre;"> </span>50<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>50<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>50<br />
32.<span class="Apple-tab-span" style="white-space: pre;"> </span>How to use the GraphX Library?<span class="Apple-tab-span" style="white-space: pre;"> </span>50<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>50<br />
33.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is PageRank? And how to compute it with GraphX<span class="Apple-tab-span" style="white-space: pre;"> </span>51<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>51<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>52<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>52<br />
34.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Power Iteration Clustering?<span class="Apple-tab-span" style="white-space: pre;"> </span>54<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>54<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>54<br />
35.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a Perceptron?<span class="Apple-tab-span" style="white-space: pre;"> </span>55<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>55<br />
36.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is an ANN (Artificial Neural Network)?<span class="Apple-tab-span" style="white-space: pre;"> </span>56<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>56<br />
37.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are the activation functions?<span class="Apple-tab-span" style="white-space: pre;"> </span>57<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>57<br />
38.<span class="Apple-tab-span" style="white-space: pre;"> </span>How many types of Neural Networks are known?<span class="Apple-tab-span" style="white-space: pre;"> </span>58<br />
39.<span class="Apple-tab-span" style="white-space: pre;"> </span>How can you train a Neural Network<span class="Apple-tab-span" style="white-space: pre;"> </span>59<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>59<br />
40.<span class="Apple-tab-span" style="white-space: pre;"> </span>What applications do ANNs have?<span class="Apple-tab-span" style="white-space: pre;"> </span>59<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>59<br />
41.<span class="Apple-tab-span" style="white-space: pre;"> </span>Can you code a simple ANN in Python?<span class="Apple-tab-span" style="white-space: pre;"> </span>60<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>60<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>60<br />
42.<span class="Apple-tab-span" style="white-space: pre;"> </span>What support has Spark for Neural Networks?<span class="Apple-tab-span" style="white-space: pre;"> </span>61<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>61<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>62<br />
43.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Deep Learning?<span class="Apple-tab-span" style="white-space: pre;"> </span>63<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>63<br />
44.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are autoencoders and stacked autoencoders?<span class="Apple-tab-span" style="white-space: pre;"> </span>68<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>68<br />
45.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are convolutional neural networks?<span class="Apple-tab-span" style="white-space: pre;"> </span>69<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>69<br />
46.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are Restricted Boltzmann Machines, Deep Belief Networks and Recurrent networks?<span class="Apple-tab-span" style="white-space: pre;"> </span>70<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>70<br />
47.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is pre-training?<span class="Apple-tab-span" style="white-space: pre;"> </span>71<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>71<br />
48.<span class="Apple-tab-span" style="white-space: pre;"> </span>An example of Deep Learning with nolearn and Lasagne package<span class="Apple-tab-span" style="white-space: pre;"> </span>72<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>72<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>73<br />
Outcome<span class="Apple-tab-span" style="white-space: pre;"> </span>73<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>74<br />
49.<span class="Apple-tab-span" style="white-space: pre;"> </span>Can you compute an embedding with Word2Vec?<span class="Apple-tab-span" style="white-space: pre;"> </span>75<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>75<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>76<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>77<br />
50.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are Radial Basis Networks?<span class="Apple-tab-span" style="white-space: pre;"> </span>77<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>77<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>78<br />
51.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are Splines?<span class="Apple-tab-span" style="white-space: pre;"> </span>78<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>78<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>78<br />
52.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are Self-Organized-Maps (SOMs)?<span class="Apple-tab-span" style="white-space: pre;"> </span>78<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>78<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>79<br />
53.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Conjugate Gradient?<span class="Apple-tab-span" style="white-space: pre;"> </span>79<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>79<br />
54.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is exploitation-exploration? And what is the armed bandit method?<span class="Apple-tab-span" style="white-space: pre;"> </span>80<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>80<br />
55.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Simulated Annealing?<span class="Apple-tab-span" style="white-space: pre;"> </span>81<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>81<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>81<br />
56.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a Monte Carlo experiment?<span class="Apple-tab-span" style="white-space: pre;"> </span>81<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>81<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>82<br />
57.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is a Markov Chain?<span class="Apple-tab-span" style="white-space: pre;"> </span>83<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>83<br />
58.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Gibbs sampling?<span class="Apple-tab-span" style="white-space: pre;"> </span>83<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>83<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>84<br />
59.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Locality Sensitive Hashing (LSH)?<span class="Apple-tab-span" style="white-space: pre;"> </span>84<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>84<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>85<br />
60.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is minHash?<span class="Apple-tab-span" style="white-space: pre;"> </span>85<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>85<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>86<br />
61.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are Bloom Filters?<span class="Apple-tab-span" style="white-space: pre;"> </span>86<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>86<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>87<br />
62.<span class="Apple-tab-span" style="white-space: pre;"> </span>What are Count-Min Sketches?<span class="Apple-tab-span" style="white-space: pre;"> </span>87<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>87<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>87<br />
63.<span class="Apple-tab-span" style="white-space: pre;"> </span>How to build a news clustering system<span class="Apple-tab-span" style="white-space: pre;"> </span>88<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>88<br />
64.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is A/B testing?<span class="Apple-tab-span" style="white-space: pre;"> </span>89<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>89<br />
65.<span class="Apple-tab-span" style="white-space: pre;"> </span>What is Natural Language Processing?<span class="Apple-tab-span" style="white-space: pre;"> </span>90<br />
Solution<span class="Apple-tab-span" style="white-space: pre;"> </span>90<br />
Code<span class="Apple-tab-span" style="white-space: pre;"> </span>90<br />
Outcome<span class="Apple-tab-span" style="white-space: pre;"> </span>92<br />
66.<span class="Apple-tab-span" style="white-space: pre;"> </span>Where to go from here<span class="Apple-tab-span" style="white-space: pre;"> </span>92<br />
Appendix A<span class="Apple-tab-span" style="white-space: pre;"> </span>95<br />
67.<span class="Apple-tab-span" style="white-space: pre;"> </span>Ultra-Quick introduction to Python<span class="Apple-tab-span" style="white-space: pre;"> </span>95<br />
68.<span class="Apple-tab-span" style="white-space: pre;"> </span>Ultra-Quick introduction to Probabilities<span class="Apple-tab-span" style="white-space: pre;"> </span>96<br />
69.<span class="Apple-tab-span" style="white-space: pre;"> </span>Ultra-Quick introduction to Matrices and Vectors<span class="Apple-tab-span" style="white-space: pre;"> </span>97<br />
70.<span class="Apple-tab-span" style="white-space: pre;"> </span>Ultra-Quick summary of metrics<span class="Apple-tab-span" style="white-space: pre;"> </span>98<br />
Classification Metrics<span class="Apple-tab-span" style="white-space: pre;"> </span>98<br />
Clustering Metrics<span class="Apple-tab-span" style="white-space: pre;"> </span>99<br />
Scoring Metrics<span class="Apple-tab-span" style="white-space: pre;"> </span>99<br />
Rank Correlation Metrics<span class="Apple-tab-span" style="white-space: pre;"> </span>99<br />
Probability Metrics<span class="Apple-tab-span" style="white-space: pre;"> </span>100<br />
Ranking Models<span class="Apple-tab-span" style="white-space: pre;"> </span>100<br />
71.<span class="Apple-tab-span" style="white-space: pre;"> </span>Comparison of different machine learning techniques<span class="Apple-tab-span" style="white-space: pre;"> </span>101<br />
Linear regression<span class="Apple-tab-span" style="white-space: pre;"> </span>101<br />
Logistic regression<span class="Apple-tab-span" style="white-space: pre;"> </span>101<br />
Support Vector Machines<span class="Apple-tab-span" style="white-space: pre;"> </span>101<br />
Clustering<span class="Apple-tab-span" style="white-space: pre;"> </span>102<br />
Decision Trees, Random Forests, and GBTs<span class="Apple-tab-span" style="white-space: pre;"> </span>102<br />
Associative Rules<span class="Apple-tab-span" style="white-space: pre;"> </span>102<br />
Neural Networks and Deep Learning<span class="Apple-tab-span" style="white-space: pre;"> </span>103<br />
<div>
<br /></div>
<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-4700235229527323442015-11-07T08:07:00.000-08:002015-11-08T03:46:59.581-08:00The art of news clustering: modern metrics for the ResearchersTeam shipped another cool feature. Nowadays, the modern researcher is not limited to academic papers and labs: breakthrough research is mentioned by news sources, and articles published in generalist magazines and newspapers discuss the progress made by science across all disciplines.<br />
<br />
One key aspect is to have fast algorithms, based on machine learning and data analysis, for grouping related articles as soon as they are published. In this way, data science can help infer the importance of each piece of information.<br />
<br />
My group recently acquired <a href="http://www.wired.co.uk/news/archive/2015-01/12/elsevier-acquires-newsflo">Newsflo</a>, an innovative company in London, and together we shipped an engine for clustering news articles that mention academic papers and research. This engine is progressively being shipped in all Elsevier's products. Here is the integration with<a href="https://www.myresearchdashboard.com/carsclient/loginfull"> myresearchdashboard.com</a><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZmg6RNLk9Z6nYKaNFiivSMFaRj4GtpWyWBcVPnb4yqHF2DLHQilMfYzC93u3Kts0H3r4lKARS6kJPQRq45BxqfmgURuuhMB_4V32PMaP1V09LhZW6hz76pTw0QOU4qVZAA6ZlaIt4Au4x/s1600/news.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="552" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZmg6RNLk9Z6nYKaNFiivSMFaRj4GtpWyWBcVPnb4yqHF2DLHQilMfYzC93u3Kts0H3r4lKARS6kJPQRq45BxqfmgURuuhMB_4V32PMaP1V09LhZW6hz76pTw0QOU4qVZAA6ZlaIt4Au4x/s640/news.png" width="640" /></a></div>
<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-14301642492188644002015-11-06T04:23:00.000-08:002015-11-08T03:47:17.500-08:00Search terms as an automatic way to annotate scientific articlesAnother feature has been shipped by the team.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKMkB0cpW25AKmu2cJHSaF8rx1nsamAVPOh1PN-LesBTX6xO5VhJdmVyAb6K804K5d-TQ12jwbBYKVAm-YenEGU8Bkhr0Ex33PKRm8JUJebe1ZUi2qgpvFEFEnddVbu5ceBfn3DdUK_MRg/s1600/unnamed.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKMkB0cpW25AKmu2cJHSaF8rx1nsamAVPOh1PN-LesBTX6xO5VhJdmVyAb6K804K5d-TQ12jwbBYKVAm-YenEGU8Bkhr0Ex33PKRm8JUJebe1ZUi2qgpvFEFEnddVbu5ceBfn3DdUK_MRg/s640/unnamed.jpg" width="578" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Search terms are an automatic way to annotate scientific articles. Here we show the aggregated (i.e., anonymized) queries which were submitted by users to retrieve my article</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<a href="https://www.mendeley.com/stats/articles/22334278000/2-s2.0-34748866005" target="_blank">https://www.mendeley.com/<wbr></wbr>stats/articles/22334278000/2-<wbr></wbr>s2.0-34748866005</a> </div>
<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-62023930946659380692015-11-05T00:59:00.000-08:002015-11-03T00:59:36.758-08:00How to build a news clustering system<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">(excerpt from my new book, question #65)</span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">News clustering is a hard problem to solve.
News articles typically arrive at our clustering engine in a continuous
streaming fashion; therefore, a plain vanilla batch approach is not feasible.
For instance, the simple idea of using k-means cannot work for two reasons.
First, it is not possible to know the number of clusters a priori because the
topics are dynamically evolving. Second, the articles themselves are not
available a priori. Therefore, more sophisticated strategies are required. <o:p></o:p></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">One initial idea is to split the data into
mini-batches (perhaps processed with Spark Streaming) and to cluster the
content of each mini-batch independently. Then, clusters from different epochs (i.e.,
mini-batches) can be chained together.<o:p></o:p></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">An additional intuition is to start with
k seeds and then extend those initial k clusters whenever a new article arrives that is
not similar enough to the existing groups. In this way, new clusters are
dynamically created when needed. In a further variant, we could
re-cluster all the articles after a number of epochs, under the
assumption that this will improve our target metric.<o:p></o:p></span></div>
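The seed-extension strategy can be sketched in a few lines of Python. This is a minimal illustration built on invented assumptions (hypothetical `incremental_cluster` and `cosine` helpers, a 0.5 similarity threshold, running-mean centroid updates), not the engine described above:

```python
import math

def cosine(u, v):
    # cosine similarity between two dense article vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def incremental_cluster(articles, seeds, threshold=0.5):
    centroids = [list(s) for s in seeds]
    counts = [1] * len(centroids)          # each seed counts as one observation
    assignments = []
    for vec in articles:
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] >= threshold:
            counts[best] += 1
            n = counts[best]
            # incremental running-mean update of the matched centroid
            centroids[best] = [(c * (n - 1) + x) / n
                               for c, x in zip(centroids[best], vec)]
            assignments.append(best)
        else:
            centroids.append(list(vec))    # open a new cluster dynamically
            counts.append(1)
            assignments.append(len(centroids) - 1)
    return centroids, assignments
```

An article close to a seed joins that cluster and shifts its centroid; an article far from every centroid opens a new cluster on the fly, which is exactly the dynamic behaviour k-means lacks.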
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">In addition, we can have a look at the
data and perhaps notice that many articles are near-duplicates. Hence, we could
aim at reducing the computational complexity by applying pseudo-linear techniques
such as minHash shingling. <o:p></o:p></span></div>
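As a rough sketch of how minHash can flag near-duplicates, consider the following; the helper names and the choice of 64 seeded hash functions are illustrative assumptions, not a production implementation:

```python
import hashlib

def shingles(text, k=3):
    # set of k-word shingles of a text
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def minhash_signature(shingle_set, num_hashes=64):
    # one minimum per seeded hash function; md5 stands in for a hash family
    return [
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingle_set)
        for seed in range(num_hashes)
    ]

def estimated_jaccard(sig_a, sig_b):
    # fraction of agreeing components estimates the Jaccard similarity
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Comparing short fixed-size signatures instead of full shingle sets is what makes the technique pseudo-linear over a large stream of articles.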
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">More sophisticated methods aim at ranking
the articles by importance. This is an even harder problem, again because of the
dynamic nature of the content and the absence of links, which could have allowed
PageRank-type computations. If that is not possible, then a two-layer model
could be considered, where the importance of a news article depends on the importance
of the originating news sources, which in turn depends on the importance of
the emitted articles. It can be proven that this recurrent definition has a fixed-point
solution<a href="file:///C:/Users/agull_000/Dropbox/book/31%20oct%20-%20volume2%20-%20%20A%20collection%20of%20Data%20Science%20and%20%20Machine%20Learning%20%20Interview%20Questions%20Solved%20in%20Python%20and%20Spark.docx#_ftn1" name="_ftnref1" title=""><span class="MsoFootnoteReference"><!--[if !supportFootnotes]--><span class="MsoFootnoteReference"><span lang="EN-US" style="font-family: "calibri" , sans-serif; font-size: 10.5pt; line-height: 110%; mso-ansi-language: EN-US; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-fareast-font-family: "Times New Roman"; mso-fareast-language: IT;">[1]</span></span><!--[endif]--></span></a>. <o:p></o:p></span></div>
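The fixed point of this two-layer definition can be approximated by power iteration, in the spirit of HITS. The toy `two_layer_importance` function and the tiny source-article graph below are invented for illustration and are not the exact algorithm of the cited paper:

```python
# A source is important if it emits important articles; an article is
# important if it is emitted by important sources. Iterate until stable.

def two_layer_importance(edges, n_sources, n_articles, iters=100):
    """edges: list of (source_index, article_index) publication pairs."""
    source = [1.0 / n_sources] * n_sources
    article = [0.0] * n_articles
    for _ in range(iters):
        # propagate source importance to articles, then renormalize
        article = [0.0] * n_articles
        for s, a in edges:
            article[a] += source[s]
        total = sum(article)
        article = [x / total for x in article]
        # propagate article importance back to sources, then renormalize
        source = [0.0] * n_sources
        for s, a in edges:
            source[s] += article[a]
        total = sum(source)
        source = [x / total for x in source]
    return source, article

# source 0 publishes articles 0 and 1; source 1 publishes 1 and 2; source 2 publishes 2
edges = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)]
src, art = two_layer_importance(edges, n_sources=3, n_articles=3)
```

Because each step is a normalized linear map, the iteration converges to the fixed point regardless of the (uniform) starting scores; the article published by two sources ends up ranked highest.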
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">Even more sophisticated methods aim at extracting
entities from the clusters; this is typically achieved by running topic-model
detection on the evolving mini-batches.<o:p></o:p></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://www.infobarrel.com/media/image/54054.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://www.infobarrel.com/media/image/54054.jpg" height="205" width="400" /></a></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<br />
<div>
<!--[if !supportFootnotes]--><br clear="all" />
<hr align="left" size="1" width="33%" />
<!--[endif]-->
<br />
<div id="ftn1">
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<a href="file:///C:/Users/agull_000/Dropbox/book/31%20oct%20-%20volume2%20-%20%20A%20collection%20of%20Data%20Science%20and%20%20Machine%20Learning%20%20Interview%20Questions%20Solved%20in%20Python%20and%20Spark.docx#_ftnref1" name="_ftn1" title=""><span class="MsoFootnoteReference"><span lang="EN-GB"><!--[if !supportFootnotes]--><span class="MsoFootnoteReference"><span lang="EN-GB" style="font-family: "calibri" , sans-serif; font-size: 10.5pt; line-height: 110%; mso-ansi-language: EN-GB; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-fareast-font-family: "Times New Roman"; mso-fareast-language: IT;">[1]</span></span><!--[endif]--></span></span></a><span lang="EN-GB"> </span><span lang="EN-US" style="font-family: "verdana" , sans-serif; font-size: 10.0pt; mso-ansi-language: EN-US;">Ranking a stream of news, </span><span lang="EN-US" style="font-family: "verdana" , sans-serif; font-size: 10pt;">WWW '05 Proceedings of the 14th
international conference on World Wide Web, G. M. Del Corso, A. Gulli, F.
Romani.<o:p></o:p></span></div>
</div>
</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-52880825389185339492015-11-04T00:56:00.000-08:002015-11-08T03:47:55.682-08:00Disrupting the Academic Research Arena with Recommenders We live in a world where the <b>quantity of available information is massive</b>. How many movies, songs, news articles, and apps are out there, and what is the best way to find content relevant for every single user?<br />
<br />
<b>Search</b> is not <i>the only</i> solution. Search assumes that you are already aware of what you are looking for. Perhaps you <i>already</i> heard the latest song by Nicky Jam and you search a <a href="https://www.youtube.com/results?search_query=esto+no+me+gusta">few words</a>, or you want to see the latest movie by Paolo Sorrentino and so you search the <a href="http://www.bing.com/search?q=youth+la+giovinezza">title</a>. However, the problem is that you need to know in advance what you are looking for and then explicitly submit a query to <i>pull</i> (retrieve) the content. What if there is some piece of information which is very relevant but you are not aware of it? Search will not necessarily help.<br />
<br />
To overcome this limitation, Netflix, Spotify, Google Play, Apple Genius, and Amazon use recommender-based technologies to suggest fresh and relevant information to their users, with no need to explicitly submit queries. You can watch your favourite movie, listen to your songs, read your news articles, and discover new items to buy even if you are not aware in advance of what is relevant for you.<br />
<br />
Surprisingly enough, recommenders are not yet widely adopted by the research communities. How many new and fresh papers are relevant for your research discipline, and how long does it take to discover them? Traditionally, discovery is based on word-of-mouth communication, where someone in your community suggests what paper to read and what the new research trends are. But this requires time, and time is fundamental in research. That's why we worked hard with our team in London to create a break-through technology. We needed to solve this problem and help the communities.<br />
<br />
<b>So, the Team shipped an Academic Recommendation engine </b>which adopts sophisticated machine learning algorithms to learn how to discover scientific articles that are relevant for you. Moreover, <b>Recommendation is personalized </b>and based on your own scientific interests. <b>What is cool is that the algorithm makes recommendations tailored to you, the Researcher.</b><br />
<br />
<b>Let's see how this works. Browse <a href="http://mendeley.com/suggest/">mendeley.com/suggest/</a></b><br />
<br />
First, recommendations are based on what I have read previously and stored in my library. It's clear that I have an interest in data mining and usage statistics. Plus, there is a surprising article related to some new research topics I was considering recently. That's the <a href="https://en.wikipedia.org/wiki/Serendipity">serendipity</a> effect.
Then, there are also recommendations based on my research discipline (Computer Science)<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi45gJM465nqLOxOJv_pX_EUlcWNhBcgYRrT112E4Y17Ti_FdXTczZutEmo7gk4nkRWdlVWbWM2GMEMv6Cc6zRftyPe3nftz71kyPgycrF66ij2wuUBZf4SaN-GGHlxx6gOrR2dLlYLLJML/s1600/sugges1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="588" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi45gJM465nqLOxOJv_pX_EUlcWNhBcgYRrT112E4Y17Ti_FdXTczZutEmo7gk4nkRWdlVWbWM2GMEMv6Cc6zRftyPe3nftz71kyPgycrF66ij2wuUBZf4SaN-GGHlxx6gOrR2dLlYLLJML/s640/sugges1.jpg" width="640" /></a><br />
<br />
More importantly, experimental data showed that freshness is very important for research. So, we developed a special set of recommenders focused on my own very recent research activity. In my case, this relates to different methodologies for sampling the Web size, and search - of course. Then, we also show what is trending in my discipline right now.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh25RG-dW0eFxaDy6EXd8SRqWoagc8nATF25a5XHPhQdtK_feux-p8klVuELCQ5_msZPIc_zgwghtt6k8Fx7NwE70eyfi88Wv_n88KqehdhWTTMMnb5eQiHZxsETmVtHEVAFAph8vlRVvfM/s1600/suggest2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="520" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh25RG-dW0eFxaDy6EXd8SRqWoagc8nATF25a5XHPhQdtK_feux-p8klVuELCQ5_msZPIc_zgwghtt6k8Fx7NwE70eyfi88Wv_n88KqehdhWTTMMnb5eQiHZxsETmVtHEVAFAph8vlRVvfM/s640/suggest2.jpg" style="cursor: move;" width="640" /></a><br />
<br />
Obviously, we encourage users to interact with our system and fine-tune the suggestions so that the quality of the personalized recommendations can improve over time. The more you interact, the better the suggestions will be.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDJjAKX0IX9xw5JT8cTouvzVf1PlqSik6l6Nhq3zjUtlVFbvtjAQsZhyphenhyphenJKRGa-TViGHDFPmUxYLdiZaDLHy5-Ept2hamCopEWrZXMR54q8nc_NertAakSKUzYgbkl4HY_NvAHfvF4nll2X/s1600/suggest3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="179" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDJjAKX0IX9xw5JT8cTouvzVf1PlqSik6l6Nhq3zjUtlVFbvtjAQsZhyphenhyphenJKRGa-TViGHDFPmUxYLdiZaDLHy5-Ept2hamCopEWrZXMR54q8nc_NertAakSKUzYgbkl4HY_NvAHfvF4nll2X/s640/suggest3.jpg" width="640" /></a><br />
<br />
So, try this cool technology, which I believe will disrupt the way research is done and will help researchers save time.
Antonio
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-87417528568876283242015-11-03T23:50:00.000-08:002015-11-03T07:37:58.278-08:00What is a recommender system?<h2>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5JK1-4q5L2T4F6ZuDUYgtTDu5ifZ41P3otTg3-SVjfPcAGU9ZIEIENkf_SEw7ghLuRvX03t0ykElpHLGBaJa51XPWL8tTz0D0N2kOhf7kZw05hWFRas-n1bIK0D3fQcMdX440Ocr66IrR/s1600/download.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="182" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5JK1-4q5L2T4F6ZuDUYgtTDu5ifZ41P3otTg3-SVjfPcAGU9ZIEIENkf_SEw7ghLuRvX03t0ykElpHLGBaJa51XPWL8tTz0D0N2kOhf7kZw05hWFRas-n1bIK0D3fQcMdX440Ocr66IrR/s200/download.jpg" width="200" /></a><span style="text-align: justify;">(excerpt from my new book)</span></h2>
<div>
<span style="text-align: justify;"><br /></span></div>
<div>
<span style="text-align: justify;"><br /></span></div>
Recommender
systems produce a list of recommendations such as news to read, movies to see,
music to listen to, research articles to read, books to buy, and so on. The
recommendations are generated through two main approaches, which are
often combined:<br />
<br />
<br />
<ul>
<li><b>Collaborative filtering</b> approaches learn
a model from a user's past behaviour (items previously purchased or clicked, and/or
numerical ratings attributed to those items) as well as from similar choices made by
other users. The learned model is then used to predict items (or ratings for
items) that the user may have an interest in. Note that in some situations
ratings and choices are made explicitly, while in other situations they are
implicitly inferred from users’ actions. Collaborative
filtering has two variants: </li>
</ul>
<ol>
<li><b>User-based collaborative filtering</b>: a user's
interest is taken into account by looking for users who are somehow similar to
her. Each user is represented by a profile, and different kinds of similarity
metrics can be defined. For instance, a user can be represented by a vector and
the similarity could be the cosine similarity </li>
<li><b>Item-based collaborative filtering: </b>a user's
interest is directly taken into account by aggregating similar classes of
interest </li>
</ol>
<ul>
<li><b>Content-based filtering </b>approaches learn
a model based on a series of features of an item in order to recommend additional
items with similar properties. For instance, a content-based filtering system
can recommend an article similar to other articles seen in the past, or it can
recommend a song with a sound similar to ones implicitly liked in the past.</li>
</ul>
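A toy sketch of user-based collaborative filtering with cosine similarity may help make the first variant concrete. The `ratings` data, the helper names, and the weighted-average prediction rule are invented for illustration:

```python
import math

# Hypothetical user-item rating matrix as nested dicts (missing = unrated)
ratings = {
    "alice": {"m1": 5, "m2": 4, "m3": 4},
    "bob":   {"m1": 4, "m2": 5},
    "carol": {"m3": 5, "m4": 4},
}

def cosine_sim(u, v):
    # cosine similarity between two sparse rating profiles
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=2):
    # predict ratings for unseen items as a similarity-weighted average
    scores, weights = {}, {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_sim(ratings[user], their_ratings)
        for item, r in their_ratings.items():
            if item in ratings[user]:
                continue                     # already seen by the user
            scores[item] = scores.get(item, 0.0) + sim * r
            weights[item] = weights.get(item, 0.0) + sim
    predicted = sorted(((s / weights[i], i) for i, s in scores.items()
                        if weights[i] > 0), reverse=True)
    return [item for _, item in predicted[:top_n]]
```

Here "bob" gets "m3" recommended because "alice", whose rating profile is close to his, rated it; "carol" shares no rated items with him and so contributes nothing.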
Recommenders generally have to deal with a bootstrap problem when suggesting recommendations
to new, unseen users for whom very little information about their tastes is
available. In this case a solution could be to cluster new users according to
different criteria such as gender, age, and location, and/or to leverage a complete
set of signals such as time of the day, day of the week, etc. One easy approach
is to recommend what is popular, where the definition of popularity could be either
global or conditioned on a few simple criteria.<br />
<br />
More sophisticated recommenders can also leverage additional structural information.
For instance, an item can be referenced by other items, and those references can contribute to
enriching the set of features. As an example, think about a scientific publication
which is cited by other scientific publications. In this case, the citation graph
is a very useful source of information for recommendations.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-69000530742452861172015-11-02T09:40:00.000-08:002015-11-08T03:48:20.281-08:00Benchmarking your Academic Profile is BigData ComputationExciting day today. <b>Let's ship it. </b>The team worked on a <b>modern BigData pipeline</b> built on <a href="http://spark.apache.org/">Apache Spark</a> for <b>helping Researchers to benchmark their Academic Profiles</b><br />
<div>
<br /></div>
<div>
Here I am, checking how I am doing, and it's clear that I've moved to industry since I have no recent publications and few recent citations.</div>
<div>
<br /></div>
<div>
First, an overall summary of Antonio Gulli's views & citations over time</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdF2rbV38tr6VVy2nKfUdl-JiBpFMieAHM6nYfm-RTBOLJDEcF_Gj0vcOLVOcp_-JDr_iBvKXOUpBVYrAQtn53irJKJUwAACjOFawZB0V2ttZ3Ks17ebIoToR6cp_9An2ZcqVtbOjcQ6kd/s1600/stats1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdF2rbV38tr6VVy2nKfUdl-JiBpFMieAHM6nYfm-RTBOLJDEcF_Gj0vcOLVOcp_-JDr_iBvKXOUpBVYrAQtn53irJKJUwAACjOFawZB0V2ttZ3Ks17ebIoToR6cp_9An2ZcqVtbOjcQ6kd/s640/stats1.jpg" width="545" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Then an in-depth view of one selected article with its citations</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgonfsiOX4hvUCTnlBBo4lZlhO68i0CzH91TttUpsWN2-kxUYMeWCmKQ6raBxUqLQlD3v95p5GpJZPP3UeMhlADPTp5qrMQ4mzdErszYZtsD7wZwG9yedxipdOqrlFUGV1NDVXswBg9A1vf/s1600/stats2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgonfsiOX4hvUCTnlBBo4lZlhO68i0CzH91TttUpsWN2-kxUYMeWCmKQ6raBxUqLQlD3v95p5GpJZPP3UeMhlADPTp5qrMQ4mzdErszYZtsD7wZwG9yedxipdOqrlFUGV1NDVXswBg9A1vf/s640/stats2.jpg" width="552" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Check this out on http://mendeley.com/stats/</div>
<div>
<br /></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-41934201251579238142015-11-01T03:28:00.001-08:002015-11-08T03:48:53.350-08:00Academic Search and Relevance: deep searching what you are looking forLet's see some more examples of Academic Search and Relevance, this time from my domain of expertise, which is Machine Learning. Again a side-by-side comparison, and we will show why directly matching users' needs is important.<br />
<br />
<b>{deep learning autoencoders}</b><br />
<br />
Here I am interested in finding a specific innovation in deep learning. As discussed in a previous post, autoencoders are deep learning machines which are able to auto-learn the important features of a dataset with no human intervention. The machine picks the right features on your behalf, with no handcrafted feature engineering.<br />
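To make the idea concrete, here is a from-scratch sketch of a linear autoencoder trained by gradient descent on 2D points that lie on a line, so a single latent dimension suffices to reconstruct them. All names and hyper-parameters are invented for illustration; real autoencoders are deep and non-linear and would normally be built with a framework such as Keras:

```python
# Linear autoencoder with a 1-dimensional bottleneck:
# encode  h = w_e . x        (project to one latent value)
# decode  x_hat = w_d * h    (reconstruct from the latent value)
# trained to minimize the squared reconstruction error sum ||x_hat - x||^2.

def train_autoencoder(data, dim, lr=0.01, epochs=400):
    w_e = [0.3] * dim            # encoder weights
    w_d = [0.3] * dim            # decoder weights
    losses = []
    for _ in range(epochs):
        loss = 0.0
        grad_e = [0.0] * dim
        grad_d = [0.0] * dim
        for x in data:
            h = sum(we * xi for we, xi in zip(w_e, x))      # encode
            x_hat = [wd * h for wd in w_d]                  # decode
            err = [xh - xi for xh, xi in zip(x_hat, x)]     # reconstruction error
            loss += sum(e * e for e in err)
            for j in range(dim):
                grad_d[j] += 2 * err[j] * h
                grad_e[j] += 2 * sum(err[i] * w_d[i] for i in range(dim)) * x[j]
        for j in range(dim):     # full-batch gradient step
            w_d[j] -= lr * grad_d[j]
            w_e[j] -= lr * grad_e[j]
        losses.append(loss)
    return w_e, w_d, losses

# points lying on the line x2 = 2 * x1: one latent dimension is enough
data = [(0.5, 1.0), (1.0, 2.0), (-0.5, -1.0), (-1.0, -2.0)]
w_e, w_d, losses = train_autoencoder(data, dim=2)
```

The reconstruction loss drops towards zero as the encoder learns the direction of the data all by itself, which is the "auto-learned features" property the text describes.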
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxkmIO_unpUCqIvG7Wi2VWDgonyHIUwinDCpFDY6X1Folkr-LY4HakuMfcNt9YX8DUIcRRm8pefD4HCqO6ymUJJe0DHqP6gYAE7IHYnIGX8g4yYF4xWueORUurX1_BwYDIRx7Rb8NgaZis/s1600/deep+learning+autoencoders.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="428" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxkmIO_unpUCqIvG7Wi2VWDgonyHIUwinDCpFDY6X1Folkr-LY4HakuMfcNt9YX8DUIcRRm8pefD4HCqO6ymUJJe0DHqP6gYAE7IHYnIGX8g4yYF4xWueORUurX1_BwYDIRx7Rb8NgaZis/s640/deep+learning+autoencoders.jpg" width="640" /></a></div>
<br />
Google returns the seminal paper from 2006 which is considered the starting point for the renaissance of Neural Networks and their evolution into modern Deep Learning systems.<br />
<br />
However, this paper <b style="text-decoration: underline;">DOES NOT</b> talk about Autoencoders. Instead, it talks about deep belief nets, a slightly related topic. At the time of that paper, Autoencoders were NOT YET popular for Deep Learning (and the term Deep Learning had not even been coined yet).<br />
<br />
Therefore, I'd consider this a DSAT because it does not immediately satisfy my very specific search needs.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQNwWlheL7QaHHjTDPmlPr4EKr7cxjB-i_wD2CRHo_j2PLQApe0g1jdqJEgQrS6RbL1q80zxNtyB3ytu_oAk8iWpjkAz5_FpiUbcMhodf0qfBAVx7sc58UFwINbmJ04qj0DZ_jiUypixLf/s1600/autoencoders+no.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="440" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQNwWlheL7QaHHjTDPmlPr4EKr7cxjB-i_wD2CRHo_j2PLQApe0g1jdqJEgQrS6RbL1q80zxNtyB3ytu_oAk8iWpjkAz5_FpiUbcMhodf0qfBAVx7sc58UFwINbmJ04qj0DZ_jiUypixLf/s640/autoencoders+no.jpg" width="640" /></a></div>
<br />
So Google Scholar is not returning a very relevant result<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8kgFceTicjRYfARU6fS1zYU8R1ilBQ-ETa9RM-PlioCqKYjyUOXFiKo2hCI7dULR0xKIBj6pkn0OSp0kMfYkRVH5MnZLvIbBvYeau60uqBasINsSggYRiXaaBvzf9XX51620HNNPmgs8z/s1600/autoencoders+yes.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="442" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8kgFceTicjRYfARU6fS1zYU8R1ilBQ-ETa9RM-PlioCqKYjyUOXFiKo2hCI7dULR0xKIBj6pkn0OSp0kMfYkRVH5MnZLvIbBvYeau60uqBasINsSggYRiXaaBvzf9XX51620HNNPmgs8z/s640/autoencoders+yes.jpg" width="640" /></a></div>
<br />
ScienceDirect is instead returning a very relevant and recent result discussing Deep Learning and Autoencoders.<br />
<br />
What is Natural Language Processing?<br />
<br />
An excerpt from my new book - instant questions and answers, with code. This is #61, second volume (more than 100 in total).<br />
<br />
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">Natural Language Processing (NLP) is a complex
topic, and there are entire books devoted to this subject alone. Here, an introductory
survey is provided based on NLTK, the Python Natural Language Toolkit. Let
us start.<o:p></o:p></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">Text is made up of <i>sentences</i>, and sentences are composed of <i>words</i>. So the first step in NLP is frequently to separate those
basic units according to the rules of the chosen language. Very frequent
words often carry little information, and they should be filtered out as <i>stopwords</i>. The first code fragment splits
text into sentences and then sentences into words, from which stop words are then
removed.<o:p></o:p></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">In addition, it can be interesting to
find out the meaning of words, and here WordNet can
help with its organization of terms into <i>synsets,
</i>which are arranged in an inheritance tree where the most abstract terms are
hypernyms and the more specific terms are hyponyms. WordNet can also help in
finding <i>synonyms </i>and <i>antonyms </i>(opposite words) of a given
term. The code fragment finds the synonyms of the word<i> love</i> in English.<o:p></o:p></span></div>
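NLTK's WordNet interface exposes this tree directly (via <i>synset.hypernyms()</i>, <i>synset.hyponyms()</i> and <i>lemma.antonyms()</i>). The underlying idea can be sketched with a toy hand-made taxonomy in plain Python; the words and parent/child links below are invented purely for illustration, not real WordNet data:

```python
# A toy taxonomy illustrating the hypernym/hyponym relation behind synsets.
# Keys are parent terms, values are their more specific children.
taxonomy = {
    'animal': ['dog', 'cat'],      # 'animal' is a hypernym of 'dog'
    'dog': ['poodle', 'terrier'],  # 'poodle' is a hyponym of 'dog'
}

def hyponyms(term):
    # more specific terms: the children of 'term'
    return taxonomy.get(term, [])

def hypernyms(term):
    # more abstract terms: every parent listing 'term' as a child
    return [parent for parent, children in taxonomy.items() if term in children]

print(hypernyms('dog'))  # ['animal']
print(hyponyms('dog'))   # ['poodle', 'terrier']
```

Walking <i>hypernyms</i> repeatedly climbs toward the most abstract term, exactly as following <i>synset.hypernyms()</i> does in WordNet.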
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">Moreover, words can be stemmed, and the rules
for stemming differ greatly from language to language. NLTK provides the SnowballStemmer,
which supports multiple languages. The code fragment finds the stem of the Spanish word <i>volver</i>.<o:p></o:p></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-US">In certain situations, it can be convenient
to understand whether a word is a noun, an adjective, a verb, and so on. This is
the process of part-of-speech tagging, and NLTK provides convenient support for
this type of analysis, as illustrated in the code fragment below.<o:p></o:p></span></div>
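The unigram tagger used below can be understood as a simple lookup model: during training it records, for each word, the tag it received most often. That idea can be sketched in plain Python; the tiny tagged corpus here is made up for illustration:

```python
from collections import Counter, defaultdict

# A minimal unigram tagger: for each word, remember its most frequent tag
# in the training data (the same idea NLTK's UnigramTagger implements).
train = [[('the', 'DT'), ('dog', 'NN'), ('barks', 'VBZ')],
         [('the', 'DT'), ('cat', 'NN'), ('sleeps', 'VBZ')]]

counts = defaultdict(Counter)
for sent in train:
    for word, tag_ in sent:
        counts[word][tag_] += 1

# pick the most frequent tag seen for each word
model = {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(words):
    # unseen words get None, just as an untrained UnigramTagger backs off
    return [(w, model.get(w)) for w in words]

print(tag(['the', 'dog', 'sleeps', 'quickly']))
```

Training on the Penn Treebank sample, as the code below does, simply fills this lookup table with far more words and tags.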
<h3>
<a href="https://www.blogger.com/null" name="_Toc434048413"></a><a href="https://www.blogger.com/null" name="_Toc431906469"><span lang="EN-US">Code</span></a><span lang="EN-US"><o:p></o:p></span></h3>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">import</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> nltk</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">data<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">text </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> </span><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">"Poetry is the record of the best and happiest
moments \<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">of the happiest and best
minds. Poetry is a sword of lightning, \<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">ever unsheathed, which
consumes the scabbard that would contain it."</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;"># download stopwords</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">#nltk.download("stopwords")
</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">from</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> nltk</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">corpus </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">import</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> stopwords<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">stop </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> stopwords</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">words</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">'english'</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">)</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;"># download the punkt package</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">#nltk.download('punkt')</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;"># load the sentences'
tokenizer</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tokenizer </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> nltk</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">data</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">load</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">'tokenizers/punkt/english.pickle'</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">)</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span style="font-family: 'Courier New'; font-size: 10pt;">sentences </span><b><span style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: IT;">=</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> tokenizer</span><b><span style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: IT;">.</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">tokenize</span><b><span style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: IT;">(</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">text</span><b><span style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: IT;">)</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">print</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> sentences<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;"># tokenize in words</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">from</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> nltk</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tokenize </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">import</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> WordPunctTokenizer<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tokenizer </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> WordPunctTokenizer</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">()</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">for</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> sentence </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">in</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> sentences</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">:</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> words </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> tokenizer</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tokenize</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">sentence</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">)</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> words </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">[</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">w </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">for</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> w </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">in</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> words </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">if</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> w </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">not</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">in</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> stop</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">]</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">print</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> words <o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> <o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">#wordnet</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">#nltk.download("wordnet") </span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">from</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> nltk</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">corpus </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">import</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> wordnet<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">for</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> i</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">,</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">j </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">in</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> enumerate</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">wordnet</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">synsets</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">'love'</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">)):</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">print</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> </span><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">"Synonyms:"</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">,</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> </span><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">", "</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">join</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">j</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">lemma_names</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">())</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;"># SnowBallStemmer</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">from</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> nltk</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">stem </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">import</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> SnowballStemmer<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">stemmer </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> SnowballStemmer</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">'spanish'</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">)</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">print</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> </span><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">"Spanish
stemmer"</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">print</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> stemmer</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">stem</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="color: grey; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">'volver'</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">)</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">#tagger</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="color: green; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">#nltk.download('treebank')</span><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">from</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> nltk</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tag </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">import</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> UnigramTagger<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">from</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> nltk</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">corpus </span><b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">import</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> treebank<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">trainSentences </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> treebank</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tagged_sents</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">()[:</span></b><span lang="EN-US" style="color: red; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">5000</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">]</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tagger </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> UnigramTagger</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">trainSentences</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">)</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tagged </span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">=</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> tagger</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">.</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">tag</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">(</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">words</span><b><span lang="EN-US" style="color: navy; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">)</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<b><span lang="EN-US" style="color: blue; font-family: "Courier New"; font-size: 10.0pt; mso-ansi-language: EN-US;">print</span></b><span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;"> tagged </span><span lang="EN-US" style="font-family: "Times New Roman",serif; font-size: 12.0pt; mso-ansi-language: EN-US;"><o:p></o:p></span></div>
<h3>
<span lang="EN-US">Outcome<o:p></o:p></span></h3>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">['Poetry is the record of the best and happiest moments of the happiest and best minds.', 'Poetry is a sword of lightning, ever unsheathed, which consumes the scabbard that would contain it.']<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">['Poetry', 'record', 'best',
'happiest', 'moments', 'happiest', 'best', 'minds', '.']<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">['Poetry', 'sword',
'lightning', ',', 'ever', 'unsheathed', ',', 'consumes', 'scabbard', 'would',
'contain', '.']<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: love<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: love, passion<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: beloved, dear,
dearest, honey, love<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: love, sexual_love,
erotic_love<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: love<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: sexual_love,
lovemaking, making_love, love, love_life<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: love<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: love, enjoy<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: love<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Synonyms: sleep_together, roll_in_the_hay, love, make_out, make_love, sleep_with, get_laid, have_sex, know, do_it, be_intimate, have_intercourse, have_it_away, have_it_off, screw, fuck, jazz, eff, hump, lie_with, bed, have_a_go_at_it, bang, get_it_on, bonk</span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">Spanish stemmer<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">volv<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">[('Poetry', None), ('sword',
None), ('lightning', None), (',', u','), ('ever', u'RB'), ('unsheathed', None),
(',', u','), ('consumes<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: 0.0001pt;">
<span lang="EN-US" style="font-family: 'Courier New'; font-size: 10pt;">', None), ('scabbard',
None), ('would', u'MD'), ('contain', u'VB'), ('.', u'.')]<o:p></o:p></span></div>
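The None tags in this last output are what an NLTK unigram tagger produces for words it never saw during training: it simply looks each token up in a word-to-tag table learned from a tagged corpus, and returns None for unseen words. A minimal pure-Python sketch of that lookup behaviour (the toy training data below is invented for illustration; it is not the corpus used above):

```python
# Toy unigram tagger: assign each word its most frequent tag seen in
# training, and None to words never seen in training.
from collections import Counter, defaultdict

def train_unigram_tagger(tagged_sentences):
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag_ in sentence:
            counts[word][tag_] += 1
    # Keep the most common tag observed for each word.
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag(model, words):
    # Unseen words get None, exactly like the ('Poetry', None) pairs above.
    return [(w, model.get(w)) for w in words]

if __name__ == "__main__":
    train = [[("would", "MD"), ("contain", "VB"), (".", ".")]]
    model = train_unigram_tagger(train)
    print(tag(model, ["Poetry", "would", "contain", "."]))
    # -> [('Poetry', None), ('would', 'MD'), ('contain', 'VB'), ('.', '.')]
```

NLTK's real UnigramTagger also accepts a backoff tagger so that unseen words can fall through to a coarser model instead of None.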
<br />
<div>
<br clear="all" />
<hr align="left" size="1" width="33%" />
<br />
<div id="ftn1">
<div class="MsoFootnoteText">
<a href="#_ftnref1" name="_ftn1" title="">[1]</a> <a href="https://wordnet.princeton.edu/"><b>https://wordnet.princeton.edu/</b></a></div>
</div>
</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-46334245701118476562015-10-30T00:30:00.000-07:002015-10-30T00:30:00.951-07:00What are autoencoders and stacked autoencoders?<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-GB">An encoder is a function <i>y</i> = <i>f<sub>W,b</sub></i>(<i>x</i>) which transforms the input vector <i>x</i> into the output <i>y</i>, where <i>W</i> is a weight matrix and <i>b</i> is an offset vector. A decoder is an inverse function which tries to reconstruct the original vector <i>x</i> from <i>y</i>. An autoencoder tries to reconstruct the original input by minimizing the error made during the reconstruction process. There are two major variants of autoencoders: sparse autoencoders force sparsity by using L1 regularization, while denoising autoencoders stochastically corrupt the input with some form of randomization.</span></div>
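This encoder/decoder pair can be sketched in a few lines of NumPy. The sigmoid non-linearity and the tied decoder weights (W' = W transposed) are illustrative assumptions, common in practice but not prescribed by the text:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def encode(x, W, b):
    # y = f_{W,b}(x): project the input and squash it with a non-linearity.
    return sigmoid(W @ x + b)

def decode(y, W, b_prime):
    # Reconstruction, here with tied weights W' = W.T (a common choice).
    return sigmoid(W.T @ y + b_prime)

rng = np.random.default_rng(0)
x = rng.random(8)                       # input vector
W = rng.standard_normal((4, 8)) * 0.1   # 4 hidden units: a bottleneck
b, b_prime = np.zeros(4), np.zeros(8)

z = decode(encode(x, W, b), W, b_prime)
squared_error = np.sum((x - z) ** 2)    # the loss an autoencoder minimizes
```

Training would adjust W, b and b' by gradient descent to drive this reconstruction error down over a dataset.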
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-GB">Mathematically, a stochastic mapping transforms the input vector <i>x</i> into a noisy vector <i>x̃</i>, which is then transformed into a hidden representation <i>y</i> = <i>f<sub>W,b</sub></i>(<i>x̃</i>) = <i>s</i>(<i>Wx̃</i> + <i>b</i>). The reconstruction phase is performed by a decoder <i>z</i> = <i>f<sub>W,b</sub></i>(<i>y</i>), where the reconstruction error is minimized using either a squared-error loss or a cross-entropy loss. Autoencoders typically use a hidden layer which acts as a bottleneck that compresses the data, as in the figure.</span></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-GB"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6PfqwA7AkDsLew4gVzLJX9c8Ut0w9xRrifswbdj1lDJjm3xg8bRC6ir0ZV4GqY570nG6DW7reHSQ3YeCy-8zAxsjS6ObQSkGBV9h8FuQe5TB6WjSxkgoUlXQPulkO3xrp0NvneSKJ5BzS/s1600/autoencoders.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="167" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6PfqwA7AkDsLew4gVzLJX9c8Ut0w9xRrifswbdj1lDJjm3xg8bRC6ir0ZV4GqY570nG6DW7reHSQ3YeCy-8zAxsjS6ObQSkGBV9h8FuQe5TB6WjSxkgoUlXQPulkO3xrp0NvneSKJ5BzS/s400/autoencoders.png" width="400" /></a></div>
<div class="MsoNormal" style="text-align: justify;">
<span lang="EN-GB"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
In a deep
learning context, multiple auto-encoder are stacked for producing the final
denoised output. The “magic” outcome of this combination is that autoencoders
learn how to extract meaningful features from noise data with no need
offhand-craft features’ selection. There are also additional applications. For
instance, deep autoencoders are able to map images into compressed vectors with
small dimensionality and this can be useful for searching images by image
similarity. Plus, Deep autoencoders can map words into small dimension vectors
and this is a process useful in topic modelling distributed across a collection of documents.</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6314876008291942531.post-84307930164686881112015-10-29T00:55:00.000-07:002015-11-08T03:49:44.970-08:00Academic Search and Relevance on ScienceDirectDuring the past few months the team worked on Academic Search. So, it is time to post a number of side-by-side comparisons. Let's select a topic, say <u>Modern Finance</u>, and pick a few queries just to show where we are.<br />
<br />
<b>{quantitative easing}</b><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5m4vRqeqxs2gqCHS8yXNBhin3VdRg9BVzsDjj1mEaow7_uksfeHZwu8yB9MzwQ6EUlTdS5auMXS6FA3mv9Utuh4dL3P7OBY8nZLReOyCpBcxtyRq4gzCd6GxUbdGIIvRNC-l9PMaLjRhJ/s1600/quantitative+easing+SC.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="328" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5m4vRqeqxs2gqCHS8yXNBhin3VdRg9BVzsDjj1mEaow7_uksfeHZwu8yB9MzwQ6EUlTdS5auMXS6FA3mv9Utuh4dL3P7OBY8nZLReOyCpBcxtyRq4gzCd6GxUbdGIIvRNC-l9PMaLjRhJ/s640/quantitative+easing+SC.png" width="640" /></a></div>
<br />
All of these articles are from 2010/2011, while here instead we show fresher results, which are more relevant:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-Q9toT86ctXcrBggU554eKYLuFNovy61Bv0acCmJ2hCpGkGFGOCc-WUAEJWBZC2kZfNa5fNOgVqbY0pCTIHau1wH5udjjYkdEeRch_k0V_jeXFauE0jRONgsC-w219G_sFPKrDfVIA5Ra/s1600/quantitative+easing+SD.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="288" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-Q9toT86ctXcrBggU554eKYLuFNovy61Bv0acCmJ2hCpGkGFGOCc-WUAEJWBZC2kZfNa5fNOgVqbY0pCTIHau1wH5udjjYkdEeRch_k0V_jeXFauE0jRONgsC-w219G_sFPKrDfVIA5Ra/s640/quantitative+easing+SD.png" width="640" /></a></div>
<br />
{<b>ultrafast trading</b>}<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQetB-fpB5PWS3zDrijRJsYzef3b3jwr9U9hP3TzBUIvDVGN2nDzr-s_RV3Ar2t17MX8aRyBPI08mVRV9o2Z5U0Ok8swc_QPMJ9TSrARVYapeUCQtaaJsA9xPc0oX6z-5efXpH5JID-Yj_/s1600/ultra+fast+trading+SC.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="278" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQetB-fpB5PWS3zDrijRJsYzef3b3jwr9U9hP3TzBUIvDVGN2nDzr-s_RV3Ar2t17MX8aRyBPI08mVRV9o2Z5U0Ok8swc_QPMJ9TSrARVYapeUCQtaaJsA9xPc0oX6z-5efXpH5JID-Yj_/s640/ultra+fast+trading+SC.png" width="640" /></a></div>
<br />
Here there are both a proximity-match and a query-alteration problem: "ultrafast trading" actually means high-frequency trading, and a top result from 1999 is hardly fresh. ScienceDirect correctly nails it:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYym_vc7sveFN3k7fbl3U_M8MgKThNpUDsDd1HcC-Q_NY3nK5A8h9kwUpEXk8KC9SIKD8L0kWxAvnGGsNFyWO59unRM1SsqQ_CGcTSl2ubPZXSqH2Vy-mLK4dAeUp9YoB52CXcQtXLlwWB/s1600/ultrafast+trading+SD.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="328" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYym_vc7sveFN3k7fbl3U_M8MgKThNpUDsDd1HcC-Q_NY3nK5A8h9kwUpEXk8KC9SIKD8L0kWxAvnGGsNFyWO59unRM1SsqQ_CGcTSl2ubPZXSqH2Vy-mLK4dAeUp9YoB52CXcQtXLlwWB/s640/ultrafast+trading+SD.png" width="640" /></a></div>
<br />
{<b>fintech</b>}<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcF8_woR48k2unqDX1wSiYRPpYPvuhA62fZkS0hSYwneUlzSEthqqmbUIogJPiQv8BO2xFLux7fI4S9IhP-pvwYAtLHdZ-_Ot4qGm0668XZyHU2-QgaEYflMwJeMqDl8BF8fbmM2tftPft/s1600/fintech.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="358" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcF8_woR48k2unqDX1wSiYRPpYPvuhA62fZkS0hSYwneUlzSEthqqmbUIogJPiQv8BO2xFLux7fI4S9IhP-pvwYAtLHdZ-_Ot4qGm0668XZyHU2-QgaEYflMwJeMqDl8BF8fbmM2tftPft/s640/fintech.png" width="640" /></a></div>
<br />
Fintech has a very specific meaning in Finance, and that meaning is not captured here. ScienceDirect gets it, and the results are also fresh. Kind of cool.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOQk-oYZs3BFvqHoQv0CiwbgJ9xqSjymw0rl-owILB9RvfVuuW8OGXiXitRL18HK8XCXz_u8wOr9aqz00SzzJyx_hMNZuwpQveGqznJc_ezOQy3Q1brSNxk-NNtw2cv_W6ivCpyl7iM4bu/s1600/fintech+SD.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="218" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOQk-oYZs3BFvqHoQv0CiwbgJ9xqSjymw0rl-owILB9RvfVuuW8OGXiXitRL18HK8XCXz_u8wOr9aqz00SzzJyx_hMNZuwpQveGqznJc_ezOQy3Q1brSNxk-NNtw2cv_W6ivCpyl7iM4bu/s640/fintech+SD.png" width="640" /></a></div>
<br />
Please run your own queries and report SATs and DSATs (satisfied and dissatisfied results). Search and Relevance requires continuous investment and the work is never really done. There is always a metric to move and new lessons to apply, which is why the job is fun!!<br />
<br />
Antonio Gulli<br />
<br />Unknownnoreply@blogger.com0