ADSubtype

Subtyping patients based on cortical atrophy patterns through the Louvain method
(2015-2017: Last Update: March 10, 2017)

Description

This is a MatLab library for clustering patients with similar cortical atrophy pattern using the Louvain method for modular organization extraction. We used its implementation in the Brain Connectivity Toolbox. Thus, to run our codes, please download it here. Here, we briefly describe the method and how to install and use with a simple example. For the thorough description, please refer the paper below.
(related paper: Robust Identification of Alzheimer’ Disease subtypes based on cortical atrophy patterns, JY Park, HK Na, SS Kim, HW Kim, HJ Kim, SW Seo, DL Na, CE Han, JK Seong, Scientific Reports 2017)
If you find any error or have any suggestion, please contact us. (Jong-Yun Park or Cheol E. Han).

Our analysis consists of three steps: estimating cortical atrophy pattern, computing a similarity matrix between subjects, and clustering with the Louvain method.

Estimating cortial atrophy map
To estimate the cortial atrophy, we use z-scores of cortical thickness normalized by the distribution of cortical thickness in the CN group, instead of using raw cortical thickness. The raw cortical thickness is a snapshot of the current remnants; however, for the cluster analysis of the cortical thinning pattern, it is better to measure the extent of cortical atrophy in each patient instead. Generally, it is not possible to observe the extent of cortical thinning within a single MR image, as we cannot determine the cortical thickness of the brain before AD diagnosis. In this study, we estimated the level of cortical atrophy by normalizing the remnant cortical thickness with the distribution of cortical thickness in the CN group. The standard deviation for each brain region may account for inter-subject variability; in cases where large inter-subject variability resides, even when the remnant is small, it would not be considered as severe cortical atrophy (Figure 1). Since this procedure considers the inter-subject variability, where the large inter-subject variability lies, the resultant z-score is not low even when the subject\u2019s cortical thickness is thinner than the mean of the CN subjects. Thus, in the resultant cortical atrophy patterns, the deep or shallow atrophy is determined not only by the cortical thinning of the patient, but also by the inter-subject variability in the CN group. We believe that the use of this z-score improves subtyping performance.

Computing a similarity matrix between subjects
Our subtyping strategy utilizes correlation coefficients which tends to cluster AD patients into distictive subtypes in which patients share similar cortical atrophy pattern regardless the overall depth of cortical atrophy and thus may perform better than approaches based on Euclidean distance. This superiority arises from the nature of the correlation coefficient itself in that the absolute values of cortical thickness can be automatically controlled in the operation process and thus using correlation coefficients is useful to capture similarities in cortical thinning patterns between the two subjects, rather than the differences in thickness, and is therefore more sensitive in distinguishing the pattern difference.

Clustering with Louvain method
Finally, we employed the Louvain method which was developed for modular organization extraction in network science. The modular organization in a large network can be found by maximizing a value of modularity which is high when the intra-modular connections are dense while the inter-modular connections are sparse. The Louvian method is the current state-of-the-art method for modular organization extraction. In our problem setting, dense inter-modular connections imply high similarity of the cortical atrophy pattern between subjects in a module. The Louvain method is not only efficient for larger networks, but also very accurate; for a few large networks, its value of modularity was the highest among the current common modular organization extraction methods.

important note
Since the Louvain method is based on a greedy optimization method, the clustering results can vary slightly. To resolve this issue, we employed a ‘major voting’ scheme, in which we extracted modular organization N times, and labeled a subject with the most frequently assigned cluster. The Louvain method has a resolution parameter, γ (gamma), which controls the number of clusters.

(OPTIONAL) Visualization of cortical atrophy
We compared the cortical thickness data of subtype group with that of CN using a 2-sample t-test with random field theory using SurfStat (http://www.math.mcgill.ca/keith/surfstat). We also visualized the cortical atrophy pattern using SurfStat. Since our subtyping methods clustered subjects based on the cortical atrophy pattern not the level of overall cortical thinning, the distribution of cortical atrophy of each vertices did not follow the normal distribution. Thus we visualized the median of cortical atrophy as a cortical atrophy map.

How to install

Install Brain Connectivity Toolbox for MatLab.

Download our subtyping package.

Unpack anywhere you prefer.

Launch matlab.

Add the paths where you unpack the library by typing the following in the matlab commandline.
Or you may use the menu of the MatLab GUI.

addpath /home/user/matlab/BCT/
addpath /home/user/matlab/subtyping_louvain/

Download our example data [1] [2].

(OPTIONAL) Install SurfStats Toolbox for MatLab.

How to use

This package has two functions. estimate_cortical_atrophy and community_louvain_major_vote. See below for the example.

estimate_cortical_atrophy

This function estimates cortial atrophy from raw cortical thickness using z-scores of cortical thickness normalized by the distribution of cortical thickness in the CN group.

USAGE: AD_zscore=estimate_cortical_atrophy(AD_thickness, CN_thickness);


	AD_thickness, CN_thickness	thickenss matrix which size is m x n (m: number of subjects, n: vertex of thickness)Their column sizes which mean vertex of thickness should be identical.

	Z score	Z score matrix of thickness to be normalized (m x n matrix; m: number of subjects, n: vertex of thickness)

community_louvain_major_vote

This function extends the Louvain method with the major voting.

USAGE: [cluster_index, [confidence, Q, cluster_all]]

=community_louvain_major_vote(similarity matrix, [gamma, repetition]);


	similarity matrix	similarity matrix between nodes (subjects) (a m-by-m matrix where m is the number of subjects.)γ (gamma)

	γ (gamma)	a resolution parameter. The value lower than one will make the lesser number of clusters which means larger sized clusters (Default: 1).

	repetition	the nubmer of repetition time (Default: 1000).

	cluster_index	The primary output of the function. subtype labels for each node (each subject) (a m-lengthed vector).

	confidence	The value which shows how consistently each node (each subject) belongs to clustered subtype in major voting. (a m-by-repetition matrix).

	Q	modularity value whose maximum value is 1.

	cluster_all	esults of all repeated cluster estimations. This code is a rough implementation. It only works when the modular organization is not quite varying. VISUAL CHECK OF MAJOR VOTING RESULTS IS RECOMMENDED using the following command. imagesc(cluster_all);

cluster_mapping

This function finds correspondence of two modular organization based on the distribution of overlapped subjects.

USAGE: [ score, omap, m_idx ]=cluster_mapping(idx1, idx2);


	idx1, idx2	cell-type variables which contains subjects of each modular organization. For example, idx1={[1,2],[3,4]} means two modules with two subjects each.

	score	accuracy score which represent how well two modular organization are matched.

	omap	overlapping map. m-by-n intersect matrix where m is number of modules in idx1 and n is number of modules in idx2. Each element of (i,j) contains the number of overlapped subjects between the ith module of idx1 and jth module of idx2.

	m_idx	matching table from idx1 to idx2

draw_reordered_similarity_matrix

We reordered subjects in the similartiy matrix to show the resultant clusters. The reordered similarity matrix showes the dense intra-modular connections which implies high similarity of the cortical atrophy pattern between subjects in a module. We marked clusters with black boxes.

USAGE: similarity_reordered=draw_reordered_similarity_matrix(similarity_matrix, Ci);

where m is the number of subjects, the similarity_matrix and similarity_reordered is a m-by-m matrix, and Ci is the cluster index (a m lengthed vector). Subjects in similarity matrix should be equal to subjects in Ci.


	similarity_matrix	a m-by-m matrix where m is number of subjects

	Ci	cluster index (a m lengthed vector).

	similarity_reordered	m-by-m matrix of reordered similarity_matrix

Example

We first estimated cortical thickness of Alzheimer's Disease and normal control (CN) using FreeSurfer. We resampled and filtered the cortical thickness with the Laplace-Beltrami operators following our previous paper. Computing the cortical thickness, resampling and filtering with the Laplace-Beltrami operator is beyond the scope of this example. Instead, we prepared the resampled and filtered cortical thickenss in the example data.

1. Load the prepared dataset (type the following lines in matlab commadline)

load (dataset_name).mat

This contains three variables: Subtype_AD meta structure which have ID and cortical thickness of each hemi sphere, Subtype_NC meta structure which has the same format with AD meta, and cortex_mask. The cortical mask is used to masking the non-cortical parts (e.x. medial wall) out in the analysis. The resampled cortical thickness has 40962 values for each hemisphere. Using the following command, we extract the cortical thickness and mask out the non-cortical tissues.

AD_thk=[Subtype_AD.lb_lh, Subtype_AD.lb_rh];
AD_thk=AD_thk.*repmat(cortex_mask,[size(AD_thk,1),1]);
NC_thk=[Subtype_NC.lb_lh, Subtype_NC.lb_rh] ;
NC_thk=NC_thk.*repmat(cortex_mask,[size(NC_thk,1),1]);

2. Estimate the cortical atrophy by computing Z-score of cortical thickness normalized by the distribution of cortical thickness in the CN group using the following command.

[AD_mwr]=estimate_cortical_atrophy(AD_thk, NC_thk]);

3. Compute Similarity matirx by Pearson correlation coefficients between atrophy patterns of any two subjects.

similarity_matrix=corr(AD_mwr');

4. Subtyping using the computed similarity matrix and the Louvain method.
In this example, as we did in the paper, we set gamma value 0.9 and repetetion of major voting as 1000.

[Ci, confidence, meanQ]=community_louvain_major_vote(similarity_matrix, 0.9, 1000);

5. Draw the reordered simialarity matrix to confirm the clustering results.

draw_reordered_similarity_matrix(similarity_matrix, Ci)

6. (OPTIONAL) Visualize the cortical atrophy map using the following commands.

Visualization example should run after previous subtyping example since it requires the cortical thickness data and cluster index of each subtype. You can get more information about SurfStats here.

a) Load the prepared dataset and perform smoothing over each hemisphere surfaces.

[coord1,tri1]=load_surf_obj('lh.pial.mean.obj');
[coord2,tri2]=load_surf_obj('rh.pial.mean.obj');
tri2=tri2+40962;
surf.tri=[tri1;tri2];
surf.coord=[coord1;coord2]';
Y = SurfStatSmooth( surf.coord, surf, 20 );
s100smooth.coord = Y( :, :, 1 );
s100smooth.tri = surf.tri;
mask=logical(cortex_mask);

b) Compute statistical difference between CN and AD in cortical thickness.
SurfStatLinMod find linear model, at this case two linear models each represent CN and AD, and SurfStatT gives T statistic for CN-AD cortical thickness

thick_NC=NC_thk;
thick=[AD_thk(g_idx{j},:)];
Grp=cellstr([repmat('NC',[size(thick_NC,1),1]);repmat('AD',[size(thick,1),1])]);
glb_grp=term(Grp);
glb_thk=[thick_NC;thick];
contrast=glb_grp.NC-glb_grp.AD;
slm = SurfStatLinMod( glb_thk,(glb_grp) , surf);
slm = SurfStatT(slm,contrast);
Y = SurfStatSmooth( [mean(thick);slm.t], surf, 10 );
slm2=slm;
slm2.t=Y(2,:);

c) Estimate corrected p-values for each cluster using the random field theory (RFT) with SurfStatP. Cyan shows clusters whose corrected p-values is under 0.05, and yellow-red represents p-values of peaks for the clusters

[Pval,peak,clus]=SurfStatP( slm2 ,mask, 0.001);
[~,cb]=SurfStatView( Pval.C, s100smooth, 'corrected P' );

d) Draw an atrophy map which shows medians of the cortical atrophy (z-scores) in each subtype using the following commands (-0.6<=z<=-0.3)

temp_thk=median(AD_mwr(g_idx{j},:));
temp_thka=temp_thk;
temp_thka(temp_thk>-.3)=0;
temp_thka(temp_thk<-.6)=-.6;
[~,cb]= SurfStatView( temp_thka, s100smooth,'Atrophy map' );

SCBIA Lab. Korea University. Republic of Korea.