wbia.algo.preproc package

Submodules

wbia.algo.preproc.occurrence_blackbox module

animal_walking_speeds

ZEBRA_SPEED_MAX = 64  # km/h
ZEBRA_SPEED_RUN = 50  # km/h
ZEBRA_SPEED_SLOW_RUN = 20  # km/h
ZEBRA_SPEED_FAST_WALK = 10  # km/h
ZEBRA_SPEED_WALK = 7  # km/h

km_per_sec = .02
km_per_sec = .001
mph = km_per_sec / ut.KM_PER_MILE * 60 * 60
print('mph = %r' % (mph,))

1 / km_per_sec

import datetime
thresh_sec = datetime.timedelta(minutes=5).seconds
thresh_km = thresh_sec * km_per_sec
print('thresh_sec = %r' % (thresh_sec,))
print('thresh_km = %r' % (thresh_km,))
thresh_sec = thresh_km / km_per_sec
print('thresh_sec = %r' % (thresh_sec,))
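The unit conversions in the notes above can be sketched as a small standalone script. This is only an illustration: the helper names `kmh_to_km_per_sec`, `sec_to_km` and `km_to_sec` are hypothetical and not part of wbia, and the value 0.002 km/s is taken from the default `km_per_sec` in the function signatures of this module.

```python
import datetime

# Zebra walking speed quoted in the notes, in km/h
ZEBRA_SPEED_WALK = 7


def kmh_to_km_per_sec(kmh):
    """Convert kilometers per hour to kilometers per second (hypothetical helper)."""
    return kmh / 3600.0


def sec_to_km(thresh_sec, km_per_sec):
    """Distance covered in thresh_sec seconds at a speed of km_per_sec."""
    return thresh_sec * km_per_sec


def km_to_sec(thresh_km, km_per_sec):
    """Time needed to cover thresh_km kilometers at a speed of km_per_sec."""
    return thresh_km / km_per_sec


km_per_sec = 0.002  # default used by the signatures in this module
thresh_sec = datetime.timedelta(minutes=5).seconds
thresh_km = sec_to_km(thresh_sec, km_per_sec)

print('walk speed = %.5f km/s' % kmh_to_km_per_sec(ZEBRA_SPEED_WALK))
print('thresh_sec = %r' % (thresh_sec,))
print('thresh_km = %r' % (thresh_km,))
```

A threshold can therefore be stated in either unit and converted through `km_per_sec`, which appears to be why the module offers both `_km` and `_sec` variants of its functions.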

wbia.algo.preproc.occurrence_blackbox.cluster_timespace_km(posixtimes, latlons, thresh_km, km_per_sec=0.002)[source]

Agglomerative clustering of time/space data

Parameters

posixtimes (ndarray) – length-N array of timestamps in seconds
latlons (ndarray) – Nx2 array where columns are (lat, lon)
thresh_km (float) – threshold in kilometers

References

http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.linkage.html
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.fcluster.html

Notes

# Visualize spots http://www.darrinward.com/lat-long/?id=2009879

CommandLine:

python -m wbia.algo.preproc.occurrence_blackbox cluster_timespace_km

Doctest:

>>> from wbia.algo.preproc.occurrence_blackbox import *  # NOQA
>>> # Nx1 matrix denoting groundtruth locations (for testing)
>>> X_name = np.array([0, 1, 1, 1, 1, 1, 2, 2, 2])
>>> # Nx3 matrix where the columns are (time, lat, lon)
>>> X_data = np.array([
>>>     (0, 42.727985, -73.683994),  # MRC
>>>     (0, 42.657414, -73.774448),  # Park1
>>>     (0, 42.658333, -73.770993),  # Park2
>>>     (0, 42.654384, -73.768919),  # Park3
>>>     (0, 42.655039, -73.769048),  # Park4
>>>     (0, 42.657872, -73.764148),  # Park5
>>>     (0, 42.876974, -73.819311),  # CP1
>>>     (0, 42.862946, -73.804977),  # CP2
>>>     (0, 42.849809, -73.758486),  # CP3
>>> ])
>>> thresh_km = 5.0  # kilometers
>>> posixtimes = X_data.T[0]
>>> latlons = X_data.T[1:3].T
>>> km_per_sec = KM_PER_SEC
>>> X_labels = cluster_timespace_km(posixtimes, latlons, thresh_km)
>>> result = 'X_labels = {}'.format(ut.repr2(X_labels))
>>> print(result)
X_labels = np.array([3, 2, 2, 2, 2, 2, 1, 1, 1])
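The clustering step can be sketched with the SciPy routines listed under References. This is a minimal sketch, not the module's implementation: the names `haversine_km` and `cluster_km_sketch` are hypothetical, the single-linkage method and the 6367 km Earth radius are assumptions, and the time term is added to the spatial distance as an assumed way of combining the two.

```python
import numpy as np
import scipy.cluster.hierarchy as hier
import scipy.spatial.distance as spdist


def haversine_km(latlon1, latlon2, radius_km=6367.0):
    """Great-circle distance in km (radius is an assumed value)."""
    lat1, lon1, lat2, lon2 = np.radians(
        [latlon1[0], latlon1[1], latlon2[0], latlon2[1]])
    a = (np.sin((lat2 - lat1) / 2) ** 2 +
         np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))


def cluster_km_sketch(posixtimes, latlons, thresh_km, km_per_sec=0.002):
    """Threshold-cut hierarchical clustering of (seconds, lat, lon) rows."""
    X = np.column_stack([posixtimes, latlons])

    def dist(p, q):
        # Spatial distance plus elapsed time converted to kilometers
        return haversine_km(p[1:3], q[1:3]) + abs(p[0] - q[0]) * km_per_sec

    condensed = spdist.pdist(X, metric=dist)
    # Assumption: single linkage; the real module may use another method
    linkage_mat = hier.linkage(condensed, method='single')
    return hier.fcluster(linkage_mat, thresh_km, criterion='distance')


X_data = np.array([
    (0, 42.727985, -73.683994),  # MRC
    (0, 42.657414, -73.774448),  # Park1
    (0, 42.658333, -73.770993),  # Park2
    (0, 42.654384, -73.768919),  # Park3
    (0, 42.655039, -73.769048),  # Park4
    (0, 42.657872, -73.764148),  # Park5
    (0, 42.876974, -73.819311),  # CP1
    (0, 42.862946, -73.804977),  # CP2
    (0, 42.849809, -73.758486),  # CP3
])
labels = cluster_km_sketch(X_data[:, 0], X_data[:, 1:3], thresh_km=5.0)
print('labels = %r' % (labels,))
```

With a 5 km threshold the sketch separates the same three groups as the doctest (MRC alone, the five Park points, the three CP points); the numeric label assigned to each group is an implementation detail of `fcluster` and should not be relied on.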

wbia.algo.preproc.occurrence_blackbox.cluster_timespace_sec(posixtimes, latlons, thresh_sec=5, km_per_sec=0.002)[source]

Parameters

posixtimes (ndarray) – length-N array of timestamps in seconds
latlons (ndarray) – Nx2 array where columns are (lat, lon)
thresh_sec (float) – threshold in seconds

Doctest:

>>> from wbia.algo.preproc.occurrence_blackbox import *  # NOQA
>>> # Nx1 matrix denoting groundtruth locations (for testing)
>>> X_name = np.array([0, 1, 1, 1, 1, 1, 2, 2, 2])
>>> # Nx3 matrix where the columns are (time, lat, lon)
>>> X_data = np.array([
>>>     (0, 42.727985, -73.683994),  # MRC
>>>     (0, 42.657414, -73.774448),  # Park1
>>>     (0, 42.658333, -73.770993),  # Park2
>>>     (0, 42.654384, -73.768919),  # Park3
>>>     (0, 42.655039, -73.769048),  # Park4
>>>     (0, 42.657872, -73.764148),  # Park5
>>>     (0, 42.876974, -73.819311),  # CP1
>>>     (0, 42.862946, -73.804977),  # CP2
>>>     (0, 42.849809, -73.758486),  # CP3
>>> ])
>>> posixtimes = X_data.T[0]
>>> latlons = X_data.T[1:3].T
>>> thresh_sec = 250  # seconds
>>> X_labels = cluster_timespace_sec(posixtimes, latlons, thresh_sec)
>>> result = ('X_labels = %r' % (X_labels,))
>>> print(result)
X_labels = array([6, 4, 4, 4, 4, 5, 1, 2, 3])

Doctest:

>>> from wbia.algo.preproc.occurrence_blackbox import *  # NOQA
>>> # Nx1 matrix denoting groundtruth locations (for testing)
>>> X_name = np.array([0, 1, 1, 1, 1, 1, 2, 2, 2])
>>> # Nx3 matrix where the columns are (time, lat, lon)
>>> X_data = np.array([
>>>     (np.nan, 42.657414, -73.774448),  # Park1
>>>     (0, 42.658333, -73.770993),  # Park2
>>>     (np.nan, np.nan, np.nan),  # Park3
>>>     (np.nan, np.nan, np.nan),  # Park3.5
>>>     (0, 42.655039, -73.769048),  # Park4
>>>     (0, 42.657872, -73.764148),  # Park5
>>> ])
>>> posixtimes = X_data.T[0]
>>> latlons = X_data.T[1:3].T
>>> thresh_sec = 250  # seconds
>>> km_per_sec = KM_PER_SEC
>>> X_labels = cluster_timespace_sec(posixtimes, latlons, thresh_sec)
>>> result = 'X_labels = {}'.format(ut.repr2(X_labels))
>>> print(result)
X_labels = np.array([3, 4, 1, 2, 4, 5])

wbia.algo.preproc.occurrence_blackbox.haversine(latlon1, latlon2)[source]

Calculate the great circle distance between two points on the earth (specified in decimal degrees)

Parameters

latlon1 (tuple) – (lat, lon)
latlon2 (tuple) – (lat, lon)

Returns

distance in kilometers

Return type

float

References

en.wikipedia.org/wiki/Haversine_formula
gis.stackexchange.com/questions/81551/matching-gps-tracks
stackoverflow.com/questions/4913349/haversine-distance-gps-points

Doctest:

>>> from wbia.algo.preproc.occurrence_blackbox import *  # NOQA
>>> import scipy.spatial.distance as spdist
>>> import functools
>>> latlon1 = [-80.21895315, -158.81099213]
>>> latlon2 = [  9.77816711,  -17.27471498]
>>> kilometers = haversine(latlon1, latlon2)
>>> result = ('kilometers = %s' % (kilometers,))
>>> print(result)
kilometers = 11930.909364189827
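For reference, the haversine formula itself can be written with only the standard library. This is a sketch rather than the module's code: the function name `haversine_km` is hypothetical, and the Earth radius of 6367 km is an assumption chosen because it approximately reproduces the doctest value above.

```python
import math

# Assumed Earth radius in km; the constant used by the module may differ
EARTH_RADIUS_KM = 6367.0


def haversine_km(latlon1, latlon2):
    """Great-circle distance in km between two (lat, lon) pairs in decimal degrees."""
    lat1, lon1 = (math.radians(v) for v in latlon1)
    lat2, lon2 = (math.radians(v) for v in latlon2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine of the central angle between the two points
    a = (math.sin(dlat / 2) ** 2 +
         math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    central_angle = 2 * math.asin(math.sqrt(a))
    return EARTH_RADIUS_KM * central_angle


latlon1 = [-80.21895315, -158.81099213]
latlon2 = [9.77816711, -17.27471498]
kilometers = haversine_km(latlon1, latlon2)
print('kilometers = %s' % (kilometers,))
```

The distance is symmetric in its arguments and zero for identical points, which is what makes it usable as a metric for pairwise clustering.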

wbia.algo.preproc.occurrence_blackbox.haversine_rad(lat1, lon1, lat2, lon2)[source]

wbia.algo.preproc.occurrence_blackbox.main()[source]

CommandLine:

ib
cd ~/code/wbia/wbia/algo/preproc
python occurrence_blackbox.py --lat 42.727985 42.657414 42.658333 42.654384 --lon -73.683994 -73.774448 -73.770993 -73.768919 --sec 0 0 0 0
# Should return X_labels = [2, 1, 1, 1]

wbia.algo.preproc.occurrence_blackbox.prepare_data(posixtimes, latlons, km_per_sec=0.002, thresh_units='seconds')[source]

Packages the data and picks a distance function

Parameters

posixtimes (ndarray) –
latlons (ndarray) –
km_per_sec (float) – (default = 0.002)
thresh_units (str) – (default = 'seconds')

Returns

arr_ -

Return type

ndarray

CommandLine:

python -m wbia.algo.preproc.occurrence_blackbox prepare_data

Doctest:

>>> from wbia.algo.preproc.occurrence_blackbox import *  # NOQA
>>> posixtimes = np.array([10, 50, np.nan, np.nan, 5, 80, np.nan, np.nan])
>>> latlons = np.array([
>>>     (42.727985, -73.683994),
>>>     (np.nan, np.nan),
>>>     (np.nan, np.nan),
>>>     (42.658333, -73.770993),
>>>     (42.227985, -73.083994),
>>>     (np.nan, np.nan),
>>>     (np.nan, np.nan),
>>>     (42.258333, -73.470993),
>>> ])
>>> km_per_sec = 0.002
>>> thresh_units = 'seconds'
>>> X_data, dist_func, columns = prepare_data(posixtimes, latlons, km_per_sec, thresh_units)
>>> result = ('arr_ = %s' % (ut.repr2(X_data),))
>>> [dist_func(a, b) for a, b in ut.combinations(X_data, 2)]
>>> print(result)

wbia.algo.preproc.occurrence_blackbox.space_distance_km(pt1, pt2)[source]

wbia.algo.preproc.occurrence_blackbox.space_distance_sec(pt1, pt2, km_per_sec=0.002)[source]

wbia.algo.preproc.occurrence_blackbox.time_dist_km(sec1, sec2, km_per_sec=0.002)[source]

wbia.algo.preproc.occurrence_blackbox.time_dist_sec(sec1, sec2)[source]

wbia.algo.preproc.occurrence_blackbox.timespace_distance_km(pt1, pt2, km_per_sec=0.002)[source]

Computes distance between two points in space and time. Time is converted into spatial units using km_per_sec

Parameters

pt1 (tuple) – (seconds, lat, lon)
pt2 (tuple) – (seconds, lat, lon)
km_per_sec (float) – reasonable animal walking speed

Returns

distance in kilometers

Return type

float

Doctest:

>>> from wbia.algo.preproc.occurrence_blackbox import *  # NOQA
>>> import scipy.spatial.distance as spdist
>>> import functools
>>> km_per_sec = .02
>>> latlon1 = [40.779299,-73.9719498] # museum of natural history
>>> latlon2 = [37.7336402,-122.5050342] # San Francisco Zoo
>>> pt1 = [0.0] + latlon1
>>> pt2 = [0.0] + latlon2
>>> # google measures about 4138.88 kilometers
>>> dist_km1 = timespace_distance_km(pt1, pt2)
>>> print('dist_km1 = {!r}'.format(dist_km1))
>>> # Now add a time component
>>> pt1 = [360.0] + latlon1
>>> pt2 = [0.0] + latlon2
>>> dist_km2 = timespace_distance_km(pt1, pt2)
>>> print('dist_km2 = {!r}'.format(dist_km2))
>>> assert np.isclose(dist_km1, 4136.4568647922624)
>>> assert np.isclose(dist_km2, 4137.1768647922627)
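The two asserted values in the doctest differ by 0.72 km, which equals 360 s × 0.002 km/s, so the time term appears to be added to the spatial distance. A minimal sketch under that reading follows; the names `haversine_km` and `timespace_distance_km_sketch` are hypothetical, and the 6367 km Earth radius is an assumption.

```python
import math


def haversine_km(latlon1, latlon2, radius_km=6367.0):
    """Great-circle distance in km (radius is an assumed value)."""
    lat1, lon1 = (math.radians(v) for v in latlon1)
    lat2, lon2 = (math.radians(v) for v in latlon2)
    a = (math.sin((lat2 - lat1) / 2) ** 2 +
         math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def timespace_distance_km_sketch(pt1, pt2, km_per_sec=0.002):
    """Distance in km between two (seconds, lat, lon) points."""
    sec1, lat1, lon1 = pt1
    sec2, lat2, lon2 = pt2
    km_space = haversine_km((lat1, lon1), (lat2, lon2))
    # Elapsed time expressed as the distance an animal could have walked
    km_time = abs(sec1 - sec2) * km_per_sec
    return km_space + km_time


latlon1 = [40.779299, -73.9719498]
latlon2 = [37.7336402, -122.5050342]
dist_km1 = timespace_distance_km_sketch([0.0] + latlon1, [0.0] + latlon2)
dist_km2 = timespace_distance_km_sketch([360.0] + latlon1, [0.0] + latlon2)
print('dist_km1 = {!r}'.format(dist_km1))
print('dist_km2 = {!r}'.format(dist_km2))
```

Because the doctest calls the function without passing `km_per_sec`, the local assignment `km_per_sec = .02` has no effect there and the default of 0.002 applies.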

wbia.algo.preproc.occurrence_blackbox.timespace_distance_sec(pt1, pt2, km_per_sec=0.002)[source]

wbia.algo.preproc.preproc_annot module

Helpers for the controller's manual_annot_funcs

wbia.algo.preproc.preproc_annot.generate_annot_properties(ibs, gid_list, bbox_list=None, theta_list=None, species_list=None, nid_list=None, name_list=None, detect_confidence_list=None, notes_list=None, vert_list=None, annot_uuid_list=None, yaw_list=None, quiet_delete_thumbs=False)[source]

wbia.algo.preproc.preproc_annot.make_annotation_uuids(image_uuid_list, bbox_list, theta_list, deterministic=True)[source]

wbia.algo.preproc.preproc_annot.postget_annot_verts(vertstr_list)[source]

wbia.algo.preproc.preproc_annot.testdata_preproc_annot()[source]

wbia.algo.preproc.preproc_image module

wbia.algo.preproc.preproc_image.get_standard_ext(gpath)[source]: Returns standardized image extension

wbia.algo.preproc.preproc_image.on_delete(ibs, featweight_rowid_list, qreq_=None)[source]

wbia.algo.preproc.preproc_image.parse_exif(pil_img)[source]: Image EXIF helper

wbia.algo.preproc.preproc_image.parse_imageinfo(gpath)[source]

Worker function: gpath must be in UNIX-PATH format!

Parameters

gpath (str) – image path

Returns

param_tup – on success, a tuple of image parameters (the values for the SQL columns); otherwise None

Return type

tuple

CommandLine:

python -m wbia.algo.preproc.preproc_image --exec-parse_imageinfo

Doctest:

>>> from wbia.algo.preproc.preproc_image import *  # NOQA
>>> gpath = ut.grab_test_imgpath('patsy.jpg')
>>> param_tup = parse_imageinfo(gpath)
>>> result = ('param_tup = %s' % (str(param_tup),))
>>> print(result)
>>> uuid = param_tup[0]
>>> assert str(uuid) == '16008058-788c-2d48-cd50-f6029f726cbf'

wbia.algo.preproc.preproc_occurrence module

wbia.algo.preproc.preproc_occurrence.agglomerative_cluster_occurrences(X_data, thresh_sec)[source]

Agglomerative occurrence clustering algorithm

Parameters

X_data (ndarray) – Length N array of data to cluster
thresh_sec (float) –

Returns

(label_arr) - Length N array of cluster indexes

Return type

ndarray

CommandLine:

python -m wbia.algo.preproc.preproc_occurrence --exec-agglomerative_cluster_occurrences

References

https://docs.scipy.org/doc/scipy-0.9.0/reference/generated/scipy.cluster.hierarchy.fclusterdata.html#scipy.cluster.hierarchy.fclusterdata
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.fcluster.html

Example

>>> # DISABLE_DOCTEST
>>> from wbia.algo.preproc.preproc_occurrence import *  # NOQA
>>> X_data = '?'
>>> thresh_sec = '?'
>>> (occur_ids, occur_gids) = agglomerative_cluster_occurrences(X_data, thresh_sec)
>>> result = ('(occur_ids, occur_gids) = %s' % (str((occur_ids, occur_gids)),))
>>> print(result)

wbia.algo.preproc.preproc_occurrence.cluster_timespace(X_data, thresh)[source]

References

http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.linkage.html

CommandLine:

python -m wbia.algo.preproc.preproc_occurrence cluster_timespace --show

Example

>>> # DISABLE_DOCTEST
>>> from wbia.algo.preproc.preproc_occurrence import *  # NOQA
>>> X_data = testdata_gps()
>>> thresh = 10
>>> X_labels = cluster_timespace(X_data, thresh)
>>> fnum = pt.ensure_fnum(None)
>>> fig = pt.figure(fnum=fnum, doclf=True, docla=True)
>>> hier.dendrogram(linkage_mat, orientation='top')
>>> plot_annotaiton_gps(X_data)
>>> ut.show_if_requested()

wbia.algo.preproc.preproc_occurrence.compute_occurrence_groups(ibs, gid_list, config={}, use_gps=False, verbose=None)[source]

Parameters

ibs (IBEISController) – wbia controller object
gid_list (list) –

Returns

(None, None)

Return type

tuple

CommandLine:

python -m wbia compute_occurrence_groups

Example

>>> # DISABLE_DOCTEST
>>> from wbia.algo.preproc.preproc_occurrence import *  # NOQA
>>> import wbia
>>> ibs = wbia.opendb(defaultdb='testdb1')
>>> verbose = True
>>> images = ibs.images()
>>> gid_list = images.gids
>>> config = {}  # wbia.algo.Config.OccurrenceConfig().asdict()
>>> tup = wbia_compute_occurrences(ibs, gid_list)
>>> (flat_imgsetids, flat_gids)
>>> aids_list = list(ut.group_items(aid_list_, flat_imgsetids).values())
>>> metric = list(map(len, aids_list))
>>> sortx = ut.list_argsort(metric)[::-1]
>>> index = sortx[1]
>>> aids = aids_list[index]
>>> gids = list(set(ibs.get_annot_gids(aids)))

wbia.algo.preproc.preproc_occurrence.compute_occurrence_unixtime(ibs, occur_gids)[source]

wbia.algo.preproc.preproc_occurrence.filter_and_relabel(labels, label_gids, min_imgs_per_occurence, occur_unixtimes=None)[source]: Removes clusters with too few members. Relabels the remaining clusters so that label 0 has the most members.
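The filter-then-relabel behavior described for `filter_and_relabel` can be illustrated with plain lists. This is a hypothetical sketch, not the wbia implementation: it assumes `label_gids` is a list of lists of image ids (one list per cluster), ignores the optional `occur_unixtimes` argument, and the name `filter_and_relabel_sketch` is invented for illustration.

```python
def filter_and_relabel_sketch(label_gids, min_imgs_per_occurrence):
    """Drop small clusters, then number the rest from largest to smallest."""
    # Keep only clusters with enough members
    kept = [gids for gids in label_gids if len(gids) >= min_imgs_per_occurrence]
    # Largest cluster first; sorted() is stable, so ties keep their input order
    kept = sorted(kept, key=len, reverse=True)
    # New labels are simply the positions in the size-sorted list
    new_labels = list(range(len(kept)))
    return new_labels, kept


new_labels, new_label_gids = filter_and_relabel_sketch(
    [[1], [2, 3, 4], [5, 6]], min_imgs_per_occurrence=2)
print('new_labels = %r' % (new_labels,))
print('new_label_gids = %r' % (new_label_gids,))
```

Here the single-image cluster is removed and the three-image cluster receives label 0.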

wbia.algo.preproc.preproc_occurrence.group_images_by_label(label_arr, gid_arr)[source]: Input: length-N list of labels and length-N list of ids. Output: length-M list of unique labels and length-M list of lists of ids.
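The input/output contract of `group_images_by_label` can be sketched with a dictionary of lists. The function name `group_by_label_sketch` is hypothetical, and returning the unique labels in sorted order is an assumption; the real function may order them differently.

```python
from collections import defaultdict


def group_by_label_sketch(label_arr, gid_arr):
    """Group ids by label: N labels and N ids in, M labels and M id lists out."""
    groups = defaultdict(list)
    for label, gid in zip(label_arr, gid_arr):
        groups[label].append(gid)
    # Assumption: unique labels are reported in sorted order
    unique_labels = sorted(groups)
    return unique_labels, [groups[label] for label in unique_labels]


unique_labels, grouped_gids = group_by_label_sketch(
    [3, 2, 2, 1, 3], [10, 11, 12, 13, 14])
print('unique_labels = %r' % (unique_labels,))
print('grouped_gids = %r' % (grouped_gids,))
```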

wbia.algo.preproc.preproc_occurrence.meanshift_cluster_occurrences(X_data, quantile)[source]

Meanshift occurrence clustering algorithm

Parameters

X_data (ndarray) – Length N array of data to cluster
quantile (float) – must lie in [0, 1]; e.g. quantile=.5 is the median of all pairwise distances

Returns

Length N array of labels

Return type

ndarray

CommandLine:

python -m wbia.algo.preproc.preproc_occurrence --exec-meanshift_cluster_occurrences

Example

>>> # DISABLE_DOCTEST
>>> from wbia.algo.preproc.preproc_occurrence import *  # NOQA
>>> X_data = '?'
>>> quantile = '?'
>>> result = meanshift_cluster_occurrences(X_data, quantile)
>>> print(result)

wbia.algo.preproc.preproc_occurrence.plot_gps_html(gps_list)[source]

Plots gps coordinates on a map projection

InstallBasemap:

sudo apt-get install libgeos-dev
pip install git+https://github.com/matplotlib/basemap
http://matplotlib.org/basemap/users/examples.html

pip install gmplot

sudo apt-get install netcdf-bin
sudo apt-get install libnetcdf-dev
pip install netCDF4

Ignore:

pip install git+git://github.com/myuser/foo.git@v123

Example

>>> # DISABLE_DOCTEST
>>> from wbia.algo.preproc.preproc_occurrence import *  # NOQA
>>> import wbia
>>> ibs = wbia.opendb(defaultdb='testdb1')
>>> images = ibs.images()
>>> # Setup GPS points to draw
>>> print('Setup GPS points')
>>> gps_list_ = np.array(images.gps2)
>>> unixtime_list_ = np.array(images.unixtime2)
>>> has_gps = np.all(np.logical_not(np.isnan(gps_list_)), axis=1)
>>> has_unixtime = np.logical_not(np.isnan(unixtime_list_))
>>> isvalid = np.logical_and(has_gps, has_unixtime)
>>> gps_list = gps_list_.compress(isvalid, axis=0)
>>> unixtime_list = unixtime_list_.compress(isvalid)  # NOQA
>>> plot_image_gps(gps_list)

wbia.algo.preproc.preproc_occurrence.prepare_X_data(ibs, gid_list, use_gps=True)[source]

Splits data into groups with/without gps and time

Example

>>> # ENABLE_DOCTEST
>>> from wbia.algo.preproc.preproc_occurrence import *  # NOQA
>>> import wbia
>>> ibs = wbia.opendb(defaultdb='testdb1')
>>> images = ibs.images()
>>> # wbia.control.accessor_decors.DEBUG_GETTERS = True
>>> use_gps = True
>>> gid_list = images.gids
>>> datas = prepare_X_data(ibs, gid_list, use_gps)
>>> print(ut.repr2(datas, nl=2, precision=2))
>>> assert len(datas['both'][0]) == 12
>>> assert len(datas['neither'][0]) == 0

wbia.algo.preproc.preproc_occurrence.testdata_gps()[source]

Simple data to test GPS algorithm.

Returns

X_name (ndarray) – Nx1 matrix denoting groundtruth locations
X_data (ndarray) – Nx3 matrix where the columns are (time, lat, lon)

wbia.algo.preproc.preproc_occurrence.timespace_distance(pt1, pt2)[source]

wbia.algo.preproc.preproc_occurrence.timespace_pdist(X_data)[source]

wbia.algo.preproc.preproc_occurrence.wbia_compute_occurrences(ibs, gid_list, config=None, verbose=None)[source]

Clusters occurrences together (by time, not yet space). An occurrence is a meeting, localized in time and space, between a camera and a group of animals. Animals are identified within each occurrence.

Does not modify database state, just returns cluster ids

Parameters

ibs (IBEISController) – wbia controller object
gid_list (list) –

Returns

(None, None)

Return type

tuple

CommandLine:

python -m wbia --tf wbia_compute_occurrences:0 --show

TODO: FIXME: good example of autogen doctest return failure

wbia.algo.preproc.preproc_residual module

wbia.algo.preproc.preproc_residual.add_residual_params_gen(ibs, fid_list, qreq_=None)[source]

wbia.algo.preproc.preproc_residual.on_delete(ibs, featweight_rowid_list)[source]

wbia.algo.preproc.preproc_rvec module

wbia.algo.preproc.preproc_rvec.add_rvecs_params_gen(ibs, nInput=None)[source]

wbia.algo.preproc.preproc_rvec.generate_rvecs(vecs_list, words)[source]

Module contents

wbia.algo.preproc.IMPORT_TUPLES = [('preproc_annot', None), ('preproc_image', None), ('preproc_occurrence', None), ('preproc_residual', None), ('preproc_rvec', None)]

Regen Command:

cd /home/joncrall/code/wbia/wbia/algo/preproc
makeinit.py --modname=wbia.algo.preproc --write

wbia.algo.preproc.reassign_submodule_attributes(verbose=True)[source]: Reassigns submodule attributes; it is unclear why reloading all the modules does not already do this.

wbia.algo.preproc.reload_subs(verbose=True)[source]: Reloads wbia.algo.preproc and submodules

wbia.algo.preproc.rrrr(verbose=True): Reloads wbia.algo.preproc and submodules